To avoid disaster-recovery disasters, learn from Reg readers' experiences

Nobody has tested the tapes this decade, thought to back up the Recycle Bin, or taken care when using rm

On-Call Special: How can you avoid a disaster recovery disaster?

You can find answers in the pages of The Register, specifically our reader-contributed tales of tech support triumph and terror: On-Call and Who, Me?

We've carefully reviewed both columns, by hand, plus perused unused submissions in both columns’ inboxes, and distilled them into the following causes of disaster recovery fails:

  • Whatever backup equipment and media you use has been ignored for years and is now incapable of reading and/or writing data; you therefore lack effective and/or recent backups to restore from. You will discover this after a disaster.
  • Your users have stored data in places you don’t protect, including tempfile directories and the Windows Recycle Bin. When you cannot restore their data, you will be blamed for their errors.
  • You cannot restore backups on-site if you can’t access your office. At this point you’re thinking cloud backup can save you …
  • … but you are wrong because some users think cloud backup is a magic protection halo and will do things to nobble it.
  • Plans to migrate and/or update systems pay too little attention to data migration, and you may find yourself needing to recover data at the same time you are rebuilding hardware.
  • Colleagues who think they have mastered bulk data erase commands such as rm -rf have not mastered them, and will delete the wrong directory – or an entire drive at the worst possible moment.
  • Your network will choke when you need to perform an emergency restore.
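
On the rm -rf item, one common mitigation is to wrap recursive deletes in a guard that refuses anything outside a designated directory and defaults to a dry run. The following is a minimal Python sketch under those assumptions, not a tool from any reader's tale; the function name `guarded_rmtree` and its `allowed_base` and `dry_run` parameters are hypothetical.

```python
import shutil
from pathlib import Path


def guarded_rmtree(target: str, allowed_base: str, dry_run: bool = True) -> Path:
    """Recursively delete target only if it sits strictly inside allowed_base.

    Hypothetical helper for illustration. Both paths are resolved first, so
    ".." segments and symlinks cannot be used to escape the allowed base.
    """
    base = Path(allowed_base).resolve()
    path = Path(target).resolve()

    # Refuse the base itself and anything that is not underneath it.
    if path == base or base not in path.parents:
        raise ValueError(f"refusing to delete {path}: not strictly inside {base}")
    if not path.is_dir():
        raise ValueError(f"refusing to delete {path}: not an existing directory")

    if dry_run:
        # Default behaviour: report what would happen and touch nothing.
        print(f"dry run: would delete {path}")
    else:
        shutil.rmtree(path)
    return path
```

Calling `guarded_rmtree("/srv/scratch/old-build", "/srv/scratch")` (example paths, also assumptions) only prints what would be removed; the delete happens only when `dry_run=False` is passed explicitly.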

The fix for all of the above is developing and observing proper backup and restoration processes, spending whatever it takes on infrastructure, on site and off site, that can securely store data and restore it at speed, then testing everything often and rigorously.
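
As an illustration of what "testing everything" can mean in practice, here is a minimal Python sketch that compares a source directory with a test restore of its backup, file by file, using SHA-256 checksums. It assumes the backup has already been restored to a scratch location by whatever tool you use; the function names `sha256_of`, `manifest`, and `verify_restore` are hypothetical, and a real process would also check permissions, timestamps, and restore speed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Report files that are missing, changed, or unexpected in the restore."""
    src, dst = manifest(source), manifest(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
        "unexpected": sorted(set(dst) - set(src)),
    }
```

If all three lists in the result are empty, the test restore matches the source at the moment of comparison. Files that changed after the backup ran will show up as "changed", so the check is most meaningful against a snapshot or a quiet directory.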

Of course, you knew that already. So do tech giants like Google and Cloudflare – both of which recently lost customer data.

Even backup software vendor Veeam recently lost some of its own data.

Those incidents tell us disaster recovery is hard. So hard, in fact, that even rocket scientists can't always get it right: NASA appears to still be restoring data from tape after its November 2024 server room flood destroyed several servers. ®
