
Exsparko-destructus! What happens when wand waving meets extremely poor wiring

You killed my data centre, prepare to die

On Call Welcome to another edition of On Call in which a contractor's shonky job and a guard's Jedi-like abilities result in an impromptu pager party.

Our story, from a reader Regomised as "Andrew", takes us back to his time working for a certain telecommunications company, "still well known in the ISP/Telco hardware world," he said, "but not as big as they used to be."

Andrew's workplace enjoyed an on-site data centre with all the bells and whistles one could hope for. HVAC, UPSes – it was not short of the initialisms and acronyms so beloved by the IT world. It also featured the usual Emergency Power Off (EPO) buttons and actual humans to prowl the corridors at night to check all was secure.

"To keep everyone honest," explained Andrew, "the guards would carry a wand that would read RFID tags at various locations to record the time, date and position." Thus management would know that the guards were doing their rounds.

The data centre contained all manner of exotic gear, including a Sun Enterprise 10000 (or E10K) setup, which carried a price tag usually ending in lots and lots of numbers. As such, the guard had to pay a visit to the data centre floor to verify everything looked OK.

To ensure the guard actually went into the data centre, the RFID tag was on a panel away from the door. This panel also contained HVAC controls and alarm relay hardware. Oh, and also an EPO button, which was sensibly covered and sealed.

You can probably guess where this is going.

Going forward to the night in question…

Doubtless making swooshing sounds, the guard wielded his RFID wand like a lightsaber, bringing it down on the tag with an almighty thump. The contents of the panel rattled disapprovingly and the RFID tag was registered. The guard continued on his rounds, oblivious to the mayhem his exuberance had unleashed.

Since nothing had apparently happened, nothing was reported. And nothing was reported in the logbook either. However, something had happened, and every pager for everyone on-call (including our hero) went off. Sun's hardware had been abruptly shut down. Attention was demanded and, bleary-eyed, the SysOps and Networking teams rushed to the aid of the suddenly stricken system.

"By the time I got there the lights were back on and every server was sitting waiting for FSCK."

But what had happened? Once things were back online the guard was questioned and swore that he'd only administered a sharp rap on the RFID tag. Logs were pulled and pointed to the EPO being pushed "but the covers around each EPO button were still in place and the seals were still intact."

"How could this be?"

It transpired that the RFID tag had been stuck to the panel housing the EPO relay. "With great caution the cover was removed to find... the sketchiest relay install known to man."

The relay itself was suspended in mid-air, only supported by wires from the EPO and PDU transfer units. A contractor had installed it, thought "job's a good'un," closed up the panel and left it, dangling, waiting for a guard to pretend he was Luke Skywalker or Inigo Montoya and thump the tag on the beige case, thus shorting out the wires.

The fate of the contractor is lost to the mists of time (we suspect that with such competence and attention to detail, he must be a political advisor by now). Andrew wasn't sure what became of the RFID panel that triggered an On Call pagerdemic either.

However, in terms of data centre disasters, he noted: "Not all buttons are red but beige boring panels can easily do the job."

We've all had to deal with the results of somebody else's poor handiwork, but have you been bitten on the behind by the leavings of a shonkmeister? Or were you the leaver of the shonk? Let us know with an email to On Call. ®
