New IT boss decided to 'audit everything you guys are doing wrong'. Which went wrong
Pet consultant took down the datacenter in attempt to find other people's errors
On Call As the old saying goes, there are few certainties in life beyond death and taxes. But in this week's On-Call – The Register's regular reader-contributed tales of techies being asked to rescue the ridiculous – we shall consider another: new managers who needlessly change systems that work perfectly well.
The source of this story is a reader named "Scanlon" who once worked for a company that hired a new operations director. Not long after taking the gig, the new guy declared his intention to "audit everything you guys are doing wrong."
Which rather rubbed Scanlon up the wrong way because, as he told On-Call, "I don't pander to someone who has more ambition than technical nous."
Scanlon couldn't stop the new ops director running his eye over the company's tech, which at the time was mostly midrange systems as the company moved away from mainframes.
When Scanlon arrived at the company, he found several IBM P570 systems in place. Curiously, they were being used in bare-metal mode rather than running virtual machines.
"The result was expensive tin that wasn't performing anywhere near capacity and there were plans to buy even more kit," Scanlon scoffed.
Our hero was sorting that out as the new ops director arrived. He had one remaining issue to address: a tricky SAN driver upgrade to get a third-party vendor's storage kit playing nice with the IBM servers.
Despite the upgrade not being complete, it was included in the audit.
"I had an initial meeting with the new chap where he introduced me to an external consultant that he had engaged to audit our systems," Scanlon recalled. "I ran through the configs and pointed out that we were still dealing with a vendor to resolve the driver updates and that as a result there was a patch that we were holding off applying."
Scanlon reckoned other work could wait until the SAN was sorted, pointed out he was taking holiday the following week, and asked if the audit could wait until he returned. He left the room feeling his plan had been accepted, so headed off for his break.
On the first Tuesday of his holiday, a colleague tracked him down with news: the consultant had been given root passwords, had started the audit in Scanlon's absence, and had already run some scripts to begin his probe.
Those scripts took a while to run, which the consultant explained, naturally, on his way out the door for the day.
Not long afterwards staff noticed an eruption of alerts, followed by clients complaining they could not process transactions.
Which was why colleagues called Scanlon, who tried to triage the situation from afar.
Before long, the new ops director demanded he come to the office.
Happily, Scanlon was only five hours away by road and was decent enough to answer the call.
When he arrived, in the wee hours of Wednesday, he noticed all the servers had surrendered to a kernel panic.
A glance at log files quickly showed the script run by the auditor was the cause of the crash.
The script was buggy, but when the first server became unresponsive the consultant dismissed it as a mere pause while the script did its job.
So he ran it on more machines. Which also crashed.
Scanlon's assessment of the situation was that the script disconnected all the virtual servers from the network and storage, so the servers promptly hung.
Once he restored things to working condition, the SAN and network paths came back online, databases recovered, and business resumed.
Scanlon got back in his car and drove another five hours to resume relaxation with his family.
Is he still a little bitter about the incident? Maybe just a smidge.
"I didn't receive a word of thanks from the company, and I was never compensated for my lost holidays," he wrote.
"The one saving grace is that I remained with the company for several years after that, while the tenure of the ops director was significantly shorter."
"And my response to the company was quiet quitting for the next 12 years" – showing up to work without giving it his best effort.
Have new managers and their minions impacted your work, and seen you called out to manage their messes? Email On-Call with your story, which it would be our privilege to consider for a mention in this space on a future Friday. ®