Who, Me? Bid farewell to the festivities of the weekend with a story of self-inflicted pain in our weekly Who, Me? column.
"Don", as we shall call him, was reminded of his own database-related dark night of the soul by our recent tale of telco terror. He got in touch to confess all to the sympathetic vultures of The Register.
Don had been working on a new system to replace some elderly in-house applications and services used for sales/order processing and CRM duties. In today's integrated world, the minty-fresh platform would also serve as the back-end for a new product.
Enthusiasm is the mother of mayhem, and Don explained: "The back-end for the new trial product was already live and had some trial customers using it, but the main part of this system hadn't launched internally yet."
The pressure to get things up and running was intense. Naturally, there was an old system with data that had to be dragged kicking and screaming into the new world. Don told us: "To say that there was some crunch in the run-up to live day is an understatement. We'd been working 60-70 hour weeks for three weeks straight in an attempt to get this new platform ready."
Don, who described himself as "the lead database bod on the team", had the task of making the migration work. The process "worked, but it was too slow" and, of course, after every lengthy test run the environment had to be reset.
"We'd obviously been at this for a very long time," said Don, "so in the end I started to get lazy.
"Instead of doing a full environment reset from a set of clean backups I just used a TRUNCATE ... CASCADE statement on one of the primary tables."
Having run the code, Don gave his counterpart on the old system the nod to kick off another lengthy test run and headed out to do a bit of shopping, as you do.
Upon his return, Don found a Slack message waiting for him. The test had failed.
Don considered the problem. "The simplest explanation was that I hadn't truncated the tables," but he distinctly remembered running that TRUNCATE command... "Unless I'd done it in the wrong environment."
The next question in the sphincter-loosening fuck-up flowchart was: "Wait... which database cluster was I connected to...?"
Not the one containing all the live customer trial data? Surely not?
For those unfamiliar with the joy that is PostgreSQL, TRUNCATE wipes a table. Adding the CASCADE option will also wipe every table linked to it by a foreign key. The PostgreSQL documentation helpfully warns: "Be very careful when using this option, or else you might lose data you did not intend to!" Unlike the slower DELETE FROM statement, TRUNCATE empties a table without scanning its rows, and once the command has been committed (which happens immediately outside an explicit transaction), rolling back isn't an option.
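For a feel of how far a cascade can reach, here is a minimal Python sketch of a toy model, not PostgreSQL itself. It assumes the cascade follows foreign-key references transitively, and every table name in it is hypothetical rather than taken from Don's schema.

```python
from collections import deque

# Hypothetical schema: each table maps to the tables whose
# foreign keys reference it. None of these names come from the story.
REFERENCED_BY = {
    "customers": ["orders", "support_tickets"],
    "orders": ["order_lines", "invoices"],
    "order_lines": [],
    "invoices": [],
    "support_tickets": [],
    "products": ["order_lines"],
}


def truncate_cascade_targets(table, referenced_by):
    """Return, sorted, every table a cascading truncate of `table`
    would empty in this toy model (breadth-first over references)."""
    seen = {table}
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in referenced_by.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    # Naming one primary table drags in everything that hangs off it.
    print(truncate_cascade_targets("customers", REFERENCED_BY))
```

In this made-up schema, naming the single table "customers" empties five of the six tables, which is roughly the shape of surprise Don was in for.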
Don used the phrase "gut-swooping panic" [I like that a lot – Ed] to describe the sensation of realising he'd just nuked the production cluster from orbit, with no way of rolling back the damage.
However, rather than frantically trying to pin the blame on a nearby PFY, Don did the right thing and confessed his misdeed. The on-call team were warned that there would be some "downtime", Project Leads were informed and Don spent the next six hours "painstakingly restoring a point-in-time backup from just before I issued the fatal command".
"I eventually got everything back somewhere just after midnight, just before the Japanese users were about to start coming into the office."
"Of course the piss-taking from the team was inevitable," Don told us. However, he had owned up and dealt with the fiasco himself, "so my peers at least saw it as a sort of badge of honour – an admittedly turd-shaped badge, but a badge nonetheless."
And management? "All they wanted to know was a revised ETA on when this job would be fixed, with the unspoken-yet-very-heavily-implied message that 'the answer I'm looking for is right bloody now.'"
The moral, according to Don, is: "When you cock up, own it: admit the mistake and then work out what you need to do in order to fix it."
Oh, and: "'Crunch' is a stupid concept that only leads to overworked and burnt-out developers."
Ever earned your own turd-shaped badge of honour? Clear your conscience and share it with the never-judgmental Register crew.
For those wondering what became of last week's contributor, Charles, the story ends with him "enjoying the student union and all its grubby delights", having been banned from using any Computer Science facilities and thus pretty much guaranteed a fail that year.
Chastened by the experience, he went back to a different university years later as a mature student and picked up a first-class honours degree. ®