Next year, the seventh-generation LTO tape format will be hitting the market. We are talking about 15TB cartridges. It sounds like an interesting medium, and it actually is if you need to store huge amounts of cold data.
Cost/GB is still the best on the market for cold data, but I think there are other problems to be considered.
15TB (compressed) is a lot of data, even by today’s standards. It means that it stores 2.4 times the data of an LTO-6 (6.25TB) tape in a similar space. I’m not well informed on the physical characteristics of the magnetic tape, but even if it’s thinner and longer than its predecessor, it won’t be much different, and the new tapes are only likely to be denser in terms of MB/mm².
The tape drives will work in the same way as they always have – they have to maintain a sustained, constant speed to write data – which means that, with denser tape, you have to feed them data faster. In fact, LTO-7 tape drive throughput is up to 750MB/sec (compressed).
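To put that throughput in perspective, here is a quick back-of-the-envelope sketch of how long it takes to fill one cartridge when the drive is kept streaming the whole time. It only uses the two figures quoted above (15TB and 750MB/sec) and assumes decimal units; the function name is mine.

```python
def hours_to_fill(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to fill a cartridge at a constant, uninterrupted rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1TB = 1,000,000MB
    seconds = capacity_mb / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # Figures quoted in the text: 15TB cartridge, 750MB/sec drive.
    print(f"{hours_to_fill(15, 750):.2f} hours")
```

That is the best case: more than five hours of uninterrupted streaming, with something upstream able to deliver 750MB/sec for the entire time.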
If you don’t sustain the throughput, the tape drive won’t have enough data to write and will have to stop, refill the data buffer and reposition itself before restarting. Each of these stops takes several seconds and heavily impacts real throughput.
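The cost of these stops can be sketched with a very simple model. Everything here is an assumption for illustration: the buffer size, the reposition time and the host feed rate are hypothetical numbers, and the model ignores the speed matching that real drives use to soften the problem. The drive empties its buffer at full speed, stops, and cannot restart until it has both repositioned and refilled the buffer.

```python
def effective_throughput(drive_mb_s: float, host_mb_s: float,
                         buffer_mb: float, reposition_s: float) -> float:
    """Average MB/sec written when the host is slower than the drive.

    Simplified stop/start model (an assumption, not vendor behaviour):
    the drive starts with a full buffer, writes until it runs dry,
    then waits for the longer of the reposition time and the time the
    host needs to refill the buffer.
    """
    if host_mb_s >= drive_mb_s:
        return drive_mb_s  # the drive never starves, so it keeps streaming

    # The buffer drains at the difference between the two rates.
    write_s = buffer_mb / (drive_mb_s - host_mb_s)
    written_mb = drive_mb_s * write_s

    # The drive restarts only once it is repositioned AND the buffer is full.
    idle_s = max(reposition_s, buffer_mb / host_mb_s)
    return written_mb / (write_s + idle_s)


if __name__ == "__main__":
    # Hypothetical setup: 1,000MB buffer, 5-second reposition, 400MB/sec host.
    rate = effective_throughput(750, 400, 1000, 5)
    print(f"{rate:.0f} MB/sec effective")
```

With no reposition penalty the model simply returns the host rate, as you would expect. Add a few seconds per stop and the effective rate drops well below what the host could actually supply, which is exactly why staging and multiplexing exist.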
This is a well-known problem, which has already been seen and addressed in the past. Disk staging areas, multiplexing and other techniques have always mitigated the problem.
... it’s not about back-up
The real problem is that we don’t use these tapes for back-up only. Some of them are supposed to be used for archiving too. In this case, traditional optimisations don’t work, especially when data is written and accessed with a very random pattern.
SpectraLogic's BlackPearl appliance – a gateway that sits in front of a tape library allowing access to data through an S3 object storage API – is an interesting solution to cope with this problem, but it’s also the only one I’ve heard of.
And, in any case, I wonder if this approach will be any good once LTO capacity increases to 60 or 120 TB in the next few years.
I’m just thinking about the quantity of objects that can be stored in those cartridges. A capacity of 100TB+ could mean many millions of objects – even if each were accessed only a few times over a long period, the total number of accesses to the tape could be very high.
Think about photos: an LTO-7 can store 15,000,000MB. A single photo can be 3-5MB. That means between 3 and 5 million objects, just on one tape. Now, even if the data is rarely accessed, the risk is that a single tape will be hit often, with problems such as contention, media wear and so on.
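The arithmetic is easy to reproduce, and it is worth extending one step to see what "rarely accessed" means at this scale. The capacity and photo sizes come from the text; the read rate (each object read once every ten years) is purely a hypothetical figure of mine.

```python
TAPE_CAPACITY_MB = 15_000_000  # LTO-7 compressed capacity, as quoted in the text


def objects_per_tape(object_size_mb: float) -> int:
    """How many objects of a given size fit on one cartridge."""
    return int(TAPE_CAPACITY_MB // object_size_mb)


def reads_per_day(n_objects: int, reads_per_object_per_year: float) -> float:
    """Average daily reads hitting one tape for a given per-object read rate."""
    return n_objects * reads_per_object_per_year / 365


if __name__ == "__main__":
    for size_mb in (3, 5):
        n = objects_per_tape(size_mb)
        # Hypothetical access pattern: each object is read once every 10 years.
        daily = reads_per_day(n, 0.1)
        print(f"{size_mb}MB photos: {n:,} objects, ~{daily:.0f} reads/day")
```

Even with that very cold access pattern, a single cartridge would see roughly 800 to 1,400 reads a day, each one a potential mount and seek.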
Closing the circle
Tapes have the best $/GB, but they are becoming more and more complex to access. Long-retention back-ups and cold archives are the primary use cases, but their huge capacity could become a problem because of the high concentration of data in a single cartridge.
You know, I’m an object storage fan and maybe this is the best way to manage huge amounts of data if you need to access it more than once in its lifetime.
I must admit that I need to dig deeper into this topic before commenting further, however. And I hope to have some more answers after the SpectraLogic summit next month in Denver. ®