Nobody would ever work on the live server, right? Not intentionally, anyway
Techie installing an upgrade did everything right. But the user was already wrong
Who, Me? Greetings and felicitations, dear reader-folk, and welcome to the soft landing to the working week that we at The Reg call Who, Me? in which we share tales of readers like you who found themselves in unfortunate circumstances – often of their own making.
This week's tale is a lamentable one, to be sure, as it is a story of that oldest character flaw: overconfidence. Who among us can honestly say they have never fallen victim to that same boogeyman? Our hero we shall call "Han", that's who.
Han worked for a company that supplied Unix systems to a firm that processed photographs from film.
Younger readers may not believe it, but before smartphones came along, cameras used "film" – thin plastic ribbons coated in light-sensitive substances. After taking photos, camera users took those films to a retailer that sent them to a remote laboratory in which arcane processes turned them into images on paper.
Anyway, that was the sort of company Han was visiting on this particular fateful day. The computer room was on a mezzanine floor, with a window overlooking the factory floor where staff busied themselves processing orders and printing pricing labels. Dozens of dot-matrix printers buzzed and whirred constantly, and the noise was deafening even in the lofty heights of the mezzanine.
There were three servers in the computer room: Live (which ran the terminals), Backup/Test (in case Live fell over), and Reporting. There is wisdom in that configuration, obviously. You should always have a backup, and the existence of the Backup server gave Han confidence.
This fateful day, Han was to install a software update. But, of course, he was not going to install his update on the Live server. No, that would be foolish. His plan was to install the update on the Backup/Test server and, you know, test it, before making any changes to the Live system. This was sensible, and Han was rightly proud of his plan. It gave him confidence.
Han left home early to get on with the job. When he arrived he confirmed with the manager on duty that staff were aware of what he was doing. The manager told him a few staff were using the Backup/Test server, but that he should be OK to shut it down in about 15 minutes.
Armed with the knowledge that he would not be disrupting anyone's work and everyone was fully informed of his disruptive plan, Han set up to work in the computer room. He went to the backup server and typed in the command to shut it down in 15 minutes:
>shutdown -h -y 15
And with that, he felt confident he would have time to head to the factory canteen and get a bit of breakfast before he had to start in earnest on the upgrade. A bacon and egg roll, perhaps?
As he left he could see that the users on the Test server had received their warnings that the server was about to shut down, and this only added to his confidence. All was going well.
You can imagine, then, the sense of deflation he experienced a few minutes later when, while munching away on his breakfast, he heard the sound coming from the factory floor change. The buzzing and whirring ceased, replaced by the excited and confused exclamations of workers whose computers had mysteriously ceased to compute.
The manager appeared at the door. "What have you done?" he pleaded.
Well, it seems that at some point in the morning the Live server had encountered a fault, and the system – as it was designed to do – had failed over to the Backup/Test server. And Han, in his overconfidence that everything was going swimmingly, had failed to check in case something was not.
As a result, he had shut down the whole business – the very thing his carefully laid plan was supposed to avoid.
It took half an hour to get everything back up and running. Thirty minutes in which Han learned an important lesson: Don't get cocky.
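For the curious, the check Han skipped might be sketched along these lines. This is purely illustrative and not what Han or his employer used: the function names `count_sessions` and `safe_to_shutdown` are hypothetical, and it assumes that the number of logged-in sessions (as listed by `who`) is a fair hint that a box has quietly taken over the live workload.

```shell
#!/bin/sh
# Hypothetical pre-flight check (an assumption, not from the original tale):
# refuse to go ahead if more sessions are logged in than the caller expects.

# count_sessions: count the non-empty lines of `who`-style output on stdin.
count_sessions() {
    awk 'NF { n++ } END { print n + 0 }'
}

# safe_to_shutdown MAX: succeed only if stdin lists at most MAX sessions.
safe_to_shutdown() {
    max=$1
    sessions=$(count_sessions)
    if [ "$sessions" -gt "$max" ]; then
        echo "Refusing: $sessions sessions logged in, expected at most $max" >&2
        return 1
    fi
    echo "OK: $sessions sessions logged in (limit $max)"
}
```

Piping `who` into `safe_to_shutdown 5` before typing the shutdown command would have flagged a Backup/Test box suddenly hosting the whole factory floor rather than "a few staff". The threshold of 5 is an arbitrary example, and a session count is only a rough proxy for whether a failover has happened.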
Have you ever been so sure you were right, only to find out you were oh so wrong? Tell us about it in an email to Who, Me? and we'll lend your story a sympathetic ear.