I’ve never been fond of VTLs (virtual tape libraries). I like deduplication, but it quickly became a feature rather than a product. And lately we all want more out of backups, don’t we?
Data Domain did a great job in building a brilliant deduplication-based appliance in 2001, but that was 2001. And, while it lasted, having the ability to write backups faster than usual and taking advantage of great space efficiency was a good thing… but time passes and alternatives are easier to find. This makes the VTL less appealing than in the past (lately also visible in EMC's financial results).
Another technology I’ve always been fond of is ZFS. It’s also super easy to build a ZFS-based box! You can buy ready-to-run appliances or build them starting from scratch (with OpenSolaris/Illumos, Ubuntu and maybe other OSes). With off-the-shelf commodity hardware you can build a 4U 45/60 disk box. We’re talking about up to 480TB raw (w/ 8TB disks, before RAID, compression and dedupe), which could easily become 3 times that with compression and dedupe (almost 1.5PB!). It’s a lot of space and comes at a low cost.
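The capacity arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the 3x reduction ratio is the figure quoted above, not a measured result, and `parity_disks` is a hypothetical knob (not part of the original numbers) for anyone who wants to subtract RAID overhead before applying data reduction.

```python
def raw_capacity_tb(disk_count, disk_size_tb):
    """Raw capacity in TB, before RAID, compression and dedupe."""
    return disk_count * disk_size_tb


def effective_capacity_tb(disk_count, disk_size_tb, reduction_ratio, parity_disks=0):
    """Estimated logical capacity in TB after compression and dedupe.

    reduction_ratio is an assumed combined compression + dedupe ratio;
    real ratios depend heavily on the data being backed up.
    parity_disks is a hypothetical parameter: disks' worth of capacity
    lost to RAID parity. The default of 0 reproduces the "before RAID"
    figures from the text.
    """
    if not 0 <= parity_disks < disk_count:
        raise ValueError("parity_disks must be between 0 and disk_count - 1")
    usable_tb = raw_capacity_tb(disk_count - parity_disks, disk_size_tb)
    return usable_tb * reduction_ratio


if __name__ == "__main__":
    raw = raw_capacity_tb(60, 8)
    effective = effective_capacity_tb(60, 8, 3.0)
    print(f"raw: {raw} TB, effective: {effective / 1000:.2f} PB")
```

For the 60-disk box with 8TB drives this gives 480TB raw and 1,440TB (1.44PB) at a 3x ratio, which is where the "almost 1.5PB" comes from. Setting `parity_disks` to a non-zero value shows how quickly that headline number shrinks once redundancy is taken into account.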
You have a preference for Microsoft? Not a problem – do it with Windows Server. It works like a charm thanks to Storage Spaces, dedupe and many other features that do the same work as a ZFS-based box (probably even better if you think about all the fancy things you can do around SMB3 and SOFS).
What about Linux? Do it with Ceph. Building a scale-out cluster could be a little trickier than a single host, but in this case you can have much more scalability and several additional features.
Either way, you can save a lot of money while achieving (very) good results and taking full advantage of a solution built out of a piece of commodity hardware and an operating system. A VTL is more efficient you say? I’d like to see some proof… especially by putting TCO, TCA and real world backup jobs on the same table.
But yes, I agree that this is not enough sometimes.
Take the VTL for what it is
At the end of the day, a VTL is just a dumb data repository; you want it reliable, cost-effective, scalable, easy-to-use and efficient. What about a different approach then? Object storage is cheaper than a VTL, does better on e-vaulting (aka replication, in this case), TCO, reliability and durability, and can scale much further than any VTL.
An object storage system is more flexible than a box which supports only a few protocols for a specific use case. You can ask for the “traditional” gateway approach (like NetApp AltaVault for example), which gives you the ability to have a VTL-like front-end with all the flexibility of an object store at the back-end, or you can write directly to an S3 repository. The latter is becoming quite common in enterprise backup software and is supported by many object storage vendors.
Scalability is no longer an issue and the storage repository can do more than backups only – much more. But if this is not enough for you, there is a third way to rethink VTLs today.
Or you can ask for more than a VTL
You prefer smart over dumb? Why not put everything together? Some new startups, like Cohesity for example, have totally redefined the concept of secondary storage and data protection. And consequently, the role of the VTL too.
In fact, this type of appliance has the ability to ingest data through its integrated backup system, or by leveraging backup software like Veeam. But contrary to what happens with a dumb storage system like a VTL, data ingestion is only the first step in a much more complete and exhaustive process.
Once data is in the system, it’s possible to search it directly in Google-like fashion, analyse it and make copies for other uses (like test/dev for example). The options are plentiful, as are the potential savings brought about by consolidation, optimisation and centralisation of several secondary workloads. And these appliances can also leverage cloud for tiering, archiving and disaster recovery. At the end of the day, we are dealing with another level of efficiency that leads to faster and integrated backups, better data management and quicker restores.
Closing the circle
VTL is dead! Just kidding… but VTLs are becoming less appealing to modern infrastructures. They simply no longer make any sense, but please take this comment with a grain of salt, as I’ve never been fond of VTLs (when I was a Sun Microsystems VAR, I was the greatest fan of the Sun Fire X4500, and those boxes always served my customers as well as the best of the VTLs).
VTLs don’t usually scale enough, they are not efficient enough, they are not flexible enough. On the other hand, startups like Cohesity have coined the term Hyper-converged Secondary Storage. I don’t know if they have chosen the right words to describe it, but I like what they are doing.
When you think about HCI, you think about storage and compute collapsed together, and end users like it because they want to simplify their infrastructure. For secondary storage it’s the same, but it’s not about storage and compute – instead it is about collapsing data and data services together to simplify data management.