BOFH: Have you tried forcing an unexpected reboot?

Schrödinger's firmware and the refreshing power cycle

Episode 6

There are few things more annoying to an IT Professional than applying a firmware update that crawls to a stop at 83 percent. Luckily, today, I have one of those more annoying things right here with me. The Boss is peering over my shoulder.

If there's one thing that can make a bad situation worse, it's an idiot making suggestions like "Perhaps we should switch it off and switch it back on again?" – as if they were the person that invented the Magic Power Cycle.

For, as every seasoned IT Professional can tell you, there is one situation where the MPC should not be used – the firmware update. Many an expensive piece of hardware has ended up as a shelf ornament due to an MPC.

I look at the monitor, still 83 percent.

Which brings me to a personal bugbear – progress indicators.

Back in the old days, a progress indicator was magical gibberish like "FLASHING BIOS BACKPLANE," which only meant something to someone who knew WHAT the BIOS backplane was and WHERE the BIOS backplane flashing featured in the update procedure. Without that knowledge the continuous cryptic progress indicators served only to inform you that the system hadn't crashed. Yet.

We had a major leap forward with the "-\|/-" spinning wheel, which was like animated magic – but again served only to tell you that something was still running.

Finally, someone introduced the idiotic percentage indicator, configured to use the least useful counter scale. Copying 1,000 files of 1K each, plus 1 file of 8 terabytes? Use number of files as the indicator, and watch it hang for ages at 99.99 percent. Applying a swathe of firmware patches? Just use the patch number, regardless of the wildly varying response times.

"Ooh, it says it's verifying," the Boss says.

And then there's the "real time" progress indicator – with a countdown showing how long something thinks the upgrade will take, ticking down until it gets to zero, then negative numbers. Who knows if it's completed or not?

And all the while you can see the access LED on your USB stick still flashing irregularly...

"It might be verifying, it might not," I say guardedly.

"63 percent," the Boss says for the benefit of those of us who can't read. "...75 percent."

I let him continue with the percentage reporting as it prevents him from vocalizing his other thoughts.

"83 percent," he says excitedly.

...

"Still 83 percent," he says, a couple of minutes later.

...

"Should it still be reading 83 percent?" he asks, after about five further minutes.

...

"It looks like it's stopped at 83 percent," he informs me, ten minutes after that.

"Just..." I say, not wanting to jinx it.

"Okay, it's definitely stopped," the Boss says, at the 40-minute mark.

"It certainly looks that way," I admit.

"So... Turn it off and turn it back on again?"

"Hmmm..." I mumble.

"?" the Boss asks.

"We have here a Schrödinger's firmware situation," I explain.

"Huh?"

"The box may be dead, or it may be alive – and philosophically speaking it is both dead and alive. However, once I power cycle it, it will be one or the other – and most likely dead."

"How can it be dead – it was verifying?"

"It was verifying, but it didn't complete verification."

"So what will you do?"

"I've no choice. It's time for Quantum Superposition."

"By... using the quantum computer chip?" the Boss asks, having, no doubt, been half-reading the technical section of the newspaper in the toilet again.

"No – I'll just swap places with a service engineer," I say, reaching for the phone.

And a service call is made. Luckily the hardware concerned is new and is still under warranty. Luckier still is that the firmware patch is both a critical and mandatory upgrade, which closes that loophole. Reluctantly, they agree to send an engineer.

"What is this, the Model 150-HA or the 150-HAL?" he asks when he arrives.

"The 150-HA," I say.

"It could still be the 150-HAL," he says. "It's not on the label, but you can see it on the motherboard inside the case."

"What, and void the warranty?" I ask, smelling that trap.

"Okay, I'll just open the case and check," he says.

"And I'll just take a photo of you opening the case," I say.

"Everything seems to be okay in there," he says, looking up from a motherboard with absolutely NO visual indicators whatsoever.

"And yet..." I say.

"Have you tried powering it off and then back on?" he asks, much to the Boss's satisfaction.

"No; the upgrade notes said not to."

...

"Well, I guess we'll just have to..."

>click<

...

>clack<

... ... ... ... ... ... ... ... ... ...

"Hmm," he says, after a good ten minutes with nothing but a flashing cursor showing on the monitor.

>click<

... ... ...

>clack<

... ... ... ... ... ... ... ... ... ...

"Okay," he nods.

>click< >clack<

... ... ... ... ... ... ... ... ... ...

"Uuuhm."

>click-clack<

... ... ... ... ... ... ... ... ... ...

"Okay then," he says decisively.

>click-clack< >click-clack< >click-clack< >click-clack< >click-clack<

... ... ... ... ... ... ... ... ... ...

"I THINK," I say, "that we should abandon our tests of the MTBF of the power switch and just get that unit replaced. Under warranty."

"Oh, sometimes this works," our engineer insists.

...numerous clicks, clacks, and a burning plastic smell later...

"Well, I think maybe we'll need to ship this back to the repair team," the engineer says. "Maybe they can just replace the BIOS chip."

I point out that I'd prefer to have a new replacement unit, with the mandatory critical update installed, rather than a unit which has been power cycled more times than one of Arne Larsson's pacemakers.

"Yeah. Well, we don't actually have any spare units," he admits. "We've had a number of these failures in the past month, so..."

"So?"

"I... uh..."

"So you were switching it on and off because?"

"Sometimes it works."

"So you said. But you also knew that it doesn't work?"

"I... uh..."

"So we'll be getting a replacement?"

"Uh, yes. I guess so."

Ten minutes later I get a phone call.

"It's our engineer," I say to the Boss. "He's stuck in our elevator."

"Have you tried switching it off and back on?"

"No. But I certainly intend to. After I check there's no firmware update I can apply to it. Maybe I'll keep trying the power cycling – because sometimes it works."

The scared shouty voice coming out of my phone is silenced as I end the call...
