As we all know, the world of backup is changing, and not just in obvious ways such as the move to disk and cloud-based backup, the adoption of deduplication, the need to copy, back up and restore virtual machines, and so on.
First, flash memory and the wider availability of snapshots and replication mean that other elements of the storage infrastructure are taking on more of the responsibility for data protection. Features such as replicated snapshots are now faster and better, albeit more expensive, ways to recover a failed system than traditional backup.
Then there are equally fundamental changes under way in the nature of the data that needs protection, where it needs to be fetched from, the level of protection it needs and what you can do with it once it is backed up.
In particular, the possibility of reusing backed up data offers opportunities to mitigate the rising cost of data protection, and potentially even turn it into an asset.
Innovations such as disk-based backup allowed you to retain existing processes, which usually means scheduled backups, and to finish the job faster or, in this world of spiralling data growth, to back up more data in the same amount of time.
These changes, however, require new processes, new thinking, and probably new hardware and software too.
Hot and cold
Data growth has long since pushed us past the point where we can afford to treat all data equally.
“The first thing to do is to take a step back and find all the data you have,” says Freeform Dynamics analyst Tony Lock.
“Don't start by thinking about data protection. People have tended to protect all their data the same way. What you really need to ask is what are the protection requirements and what are the recovery requirements?”
The recovery time and recovery point objectives (RTO and RPO) are the two key measures of whether your backup has any practical value. They measure your effective ability to recover your systems from backups.
RPO is the length of time between backups (or protection events) and reflects how much recent data might be lost if a failure occurs. RTO is how long it takes to get the data back in place and ready to resume work, a process that can take longer using traditional backup technologies.
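To make the two measures concrete, here is a minimal Python sketch that checks a single failure against both objectives. The function name `assess_recovery`, the dates and the targets are all invented for illustration; real figures would come from your own backup logs and service-level agreements.

```python
from datetime import datetime, timedelta


def assess_recovery(last_backup, failure, restore_duration, rpo, rto):
    """Compare one failure scenario against RPO and RTO targets.

    All arguments are illustrative: datetimes for the two events,
    timedeltas for the restore time and the two objectives.
    """
    # Everything written after the last backup is lost.
    data_loss_window = failure - last_backup
    return {
        "data_loss_window": data_loss_window,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": restore_duration <= rto,
    }


# Hypothetical scenario: a Sunday-night backup, a failure on Wednesday afternoon.
result = assess_recovery(
    last_backup=datetime(2024, 1, 7, 23, 0),
    failure=datetime(2024, 1, 10, 15, 0),
    restore_duration=timedelta(hours=6),
    rpo=timedelta(hours=24),
    rto=timedelta(hours=8),
)
print(result)
```

In this made-up case the restore finishes inside the eight-hour RTO, but 64 hours of data are gone, so the 24-hour RPO is missed.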
Different data has different requirements, so just as you might have multiple tiers for production data depending on the performance needed, backups can be tiered too.
At one extreme, relying on weekly backups destroys the RPO for hot data, while at the other feeding everything into a continuous data protection (CDP) appliance risks wasting an expensive resource on cold data.
The chances are that you will need a combination of approaches, tailoring them to the needs of each class of data.
“You want to keep the classification as simple as possible. Some things you need to back up only once a month, others you need to back up every week or every day, and there are some that might need CDP or a similar real-time approach,” says Lock.
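A simple classification of this kind could be sketched as a lookup from required RPO to backup tier. The tier names and intervals below are assumptions chosen to mirror Lock's monthly, weekly, daily and CDP examples, not a recommendation from any vendor.

```python
from datetime import timedelta

# Hypothetical tiers: name -> interval between protection events.
# A zero interval stands in for continuous data protection (CDP).
TIERS = {
    "monthly": timedelta(days=30),
    "weekly": timedelta(days=7),
    "daily": timedelta(days=1),
    "cdp": timedelta(0),
}


def choose_tier(required_rpo):
    """Pick the least frequent tier whose interval still meets the RPO."""
    # Try the cheapest (longest interval) tier first.
    for name, interval in sorted(TIERS.items(), key=lambda kv: kv[1], reverse=True):
        if interval <= required_rpo:
            return name
    return "cdp"


print(choose_tier(timedelta(days=45)))    # cold data
print(choose_tier(timedelta(days=2)))     # warm data
print(choose_tier(timedelta(minutes=5)))  # hot data
```

Starting from the cheapest tier and working upwards keeps cold data off the expensive CDP appliance, which is the waste the text warns about.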
None of this is any good, however, if you don't protect all your applications and data, and recent seismic shifts make it difficult for IT professionals to know where all of their organisation's data is.
Instead of simply being in the data centre, it could be in the cloud, on a departmental server or appliance or, perhaps worst of all, on a multitude of mobile devices wirelessly connected over a variety of public and private networks.
The cloud and mobile aspect is massively changing the backup business, argues Wynn White, chief marketing officer at Druva, which develops endpoint backup software.
“Backup can still be disruptive when there's no good solution in place,” he says.
“Current technology assumes you are behind a firewall and that you have a fixed PC, so there's a whole new generation of technology growing up in the cloud that is solving the same problem differently.”
Backup developers are adapting to a hybrid world. Symantec, for example, has adapted Backup Exec to target Amazon AWS or Microsoft Azure storage via a gateway.
As Erica Antony, Symantec's senior director of product management, points out, increasingly complex and dynamic IT infrastructures require increasingly sophisticated information management tools.
Recycle and use
Similarly, backup technology for mobile users needs to work across unreliable networks. It must allow the backup to start and stop when a laptop suspends or a mobile device goes out of coverage, and it should be location-aware.
It must also not be disruptive; people will no longer tolerate the sort of thing that happened a decade or two ago, when your PC locked up every Friday afternoon as its weekly backup job kicked off.
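The start-and-stop behaviour described above can be sketched as a chunked copy that records how far it got. This is a toy model under stated assumptions: `resumable_backup`, the `checkpoint` dictionary and the `is_online` callback are hypothetical names, and a real endpoint product would persist the checkpoint and send data over the network rather than into a `bytearray`.

```python
def resumable_backup(data, target, checkpoint, is_online, chunk_size=4):
    """Copy data into target chunk by chunk, resuming from the checkpoint.

    Returns True when the whole payload has been copied, or False if the
    (simulated) connection dropped and the job should be retried later.
    """
    offset = checkpoint.get("offset", 0)
    while offset < len(data):
        if not is_online():
            # Remember progress so the next attempt does not start again.
            checkpoint["offset"] = offset
            return False
        target.extend(data[offset:offset + chunk_size])
        offset = min(offset + chunk_size, len(data))
        checkpoint["offset"] = offset
    return True


data = b"0123456789"
target = bytearray()
checkpoint = {}

# First attempt: the link survives two chunks, then drops.
link = iter([True, True, False])
first = resumable_backup(data, target, checkpoint, lambda: next(link))
saved = checkpoint["offset"]

# Second attempt: the link is back, so only the remainder is sent.
second = resumable_backup(data, target, checkpoint, lambda: True)
print(first, saved, second, bytes(target))
```

Because the job picks up from the saved offset, a suspended laptop costs only the chunk in flight, not the whole backup.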
More importantly, says White, people are realising that there is a lot more you could do with your backed-up data than simply restoring a crashed system or a deleted file. Centralised and properly indexed, that data could be useful in a range of other ways.
You don’t want your users treating a set of backups as an archive – the two are different applications with different requirements and expectations – but the right data store can serve multiple purposes if it is properly designed.
Plus, that centralisation can simplify your operations and allow you to apply space-saving data reduction techniques such as deduplication and compression.
After all, if you have 50 users all storing a copy of the same file, or 50 virtual machines all running the same operating system, you do not want 50 backups of that; you want all the backups to point to one master copy.
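The "one master copy" idea can be illustrated with a content-addressed store, in which each backup records only a hash that points at shared data. This is a simplified sketch, not how any particular product works; the `DedupStore` class and its methods are invented for the example, and it deduplicates whole files where commercial systems typically work on smaller blocks.

```python
import hashlib


class DedupStore:
    """Toy whole-file deduplicating store (illustrative only)."""

    def __init__(self):
        self.blobs = {}    # SHA-256 digest -> file contents, stored once
        self.catalog = {}  # (owner, path) -> digest, one entry per backup

    def backup(self, owner, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the contents only if this digest has not been seen before.
        self.blobs.setdefault(digest, data)
        self.catalog[(owner, path)] = digest
        return digest

    def restore(self, owner, path):
        return self.blobs[self.catalog[(owner, path)]]


store = DedupStore()
report = b"quarterly report" * 1000

# Fifty users each back up their own copy of the same file.
for n in range(50):
    store.backup(f"user{n}", "report.doc", report)

print(len(store.catalog), "backups,", len(store.blobs), "stored copy")
```

The catalogue holds 50 entries, but the contents are stored once, and any of the 50 users can still restore the file.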
Some, such as HP, refer to this as federated deduplication because they use it to logically fuse multiple peer backup systems to work together. Others, like ExaGrid, talk of adaptive deduplication, with backup devices cooperating in a grid, all deduplicating their incoming data in parallel.
Either way, IDC research suggests that as much as 85 per cent of storage spend is dedicated to managing copies. Some are generated by users and others by backup processes, requiring you to keep multiple daily, weekly and monthly backups, most of which are substantially identical. Clearly, you can make considerable savings here.
There are caveats of course. One is that the backup must be reassembled – or rehydrated, as some call it – from the content store before recovery. Another is that anyone in a regulated industry needs to check if data reduction is acceptable to regulators.
“There are still one or two regulatory authorities refusing to recognise deduplicated or compressed data as a genuine archive. Those people are hard to convert,” warns Lock.
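The rehydration caveat mentioned above is easier to see in a small example. In this sketch, a backup is reduced to a "recipe" of chunk hashes, and recovery means looking each chunk up again and joining them in order. The function names and the four-byte chunk size are assumptions made for readability; real systems use much larger, often variable-sized chunks.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, purely for illustration


def dedup_backup(data, chunk_store):
    """Split data into chunks, store each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)
        recipe.append(digest)
    return recipe


def rehydrate(recipe, chunk_store):
    """Reassemble the original data by fetching every chunk in recipe order."""
    return b"".join(chunk_store[digest] for digest in recipe)


chunk_store = {}
original = b"ABCDABCDABCDXYZ"
recipe = dedup_backup(original, chunk_store)
restored = rehydrate(recipe, chunk_store)
print(len(recipe), "chunks in recipe,", len(chunk_store), "stored;", restored == original)
```

The extra lookup per chunk is the rehydration cost: the restored bytes are identical to the original, but they have to be rebuilt from the content store before recovery can begin.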