Where flash shines
What is enticing about flash for archive technologists is that adding bits to a flash cell increases its capacity and so decreases its cost per bit. A flash wafer has a defined surface area and, generally speaking, a flash cell is a similar size whether it stores 1 bit (single-level cell or SLC), 2 bits (multi-level cell or MLC) or 3 bits (triple-level cell or TLC).
That sounds great but every time you add a bit to a cell you lower the access speed and also shorten its working life – the number of writes it can handle before it effectively dies.
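To make the trade-off concrete, here is a minimal Python sketch. It relies on one well-known property of flash, that a cell storing n bits has to distinguish 2^n charge levels, and on the simplifying assumption made above that every cell type occupies the same area. The function names are ours, chosen for illustration only.

```python
# Illustrative sketch: bits per cell versus charge levels and relative capacity.
# Assumes equal cell area for all cell types, as the article does.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3}


def charge_levels(bits_per_cell: int) -> int:
    """A cell storing n bits must distinguish 2**n charge states."""
    return 2 ** bits_per_cell


def relative_capacity(bits_per_cell: int, baseline_bits: int = 1) -> float:
    """Capacity of the same wafer area relative to a baseline cell type."""
    return bits_per_cell / baseline_bits


for name, bits in CELL_TYPES.items():
    print(
        f"{name}: {bits} bit(s) per cell, "
        f"{charge_levels(bits)} charge levels, "
        f"{relative_capacity(bits):.0f}x the capacity of SLC"
    )
```

Capacity grows linearly with the bit count while the number of levels to be told apart doubles with each added bit, which is one way to see why each extra bit costs speed and endurance.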
This phenomenon is exacerbated by the continuing shrinkage of flash cell size, or geometry, to increase the amount of data that can be stored in a given physical area, both in flash chips and, correspondingly, on flash wafers.
Here is a graph showing the endurance differences at different cell geometry sizes for MLC and TLC flash:
Flash endurance ranking
The chart also shows the endurance differences between MLC and TLC flash, with TLC flash endurance being measured in the hundreds of writes and MLC in the thousands as the geometry heads into the 2X nm space, meaning 29nm to 20nm.
For the archive use case, where the data rewrite level would be near zero, this does not matter.
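A rough back-of-the-envelope sketch shows why. The cycle counts below are assumptions picked to match "hundreds" and "thousands"; they are not read off the chart. The rewrite rates are hypothetical too, and the model ignores write amplification and wear levelling.

```python
# Rough sketch: how long a device lasts before its write endurance is used up.
# All figures are assumptions for illustration, not measured values.
ASSUMED_TLC_CYCLES = 500    # "hundreds of writes"
ASSUMED_MLC_CYCLES = 3000   # "thousands of writes"


def years_until_worn_out(write_cycles: int, full_rewrites_per_year: float) -> float:
    """Years of life if the whole device is rewritten at the given yearly rate."""
    if full_rewrites_per_year <= 0:
        return float("inf")  # never rewritten, so never worn out by writes
    return write_cycles / full_rewrites_per_year


# Hypothetical workloads: a busy primary store versus a near-static archive.
WORKLOADS = {"primary (1 rewrite/day)": 365.0, "archive (1 rewrite/decade)": 0.1}

for label, rate in WORKLOADS.items():
    tlc = years_until_worn_out(ASSUMED_TLC_CYCLES, rate)
    mlc = years_until_worn_out(ASSUMED_MLC_CYCLES, rate)
    print(f"{label}: TLC {tlc:,.1f} years, MLC {mlc:,.1f} years")
```

Under these assumed numbers TLC wears out in well under two years as primary storage, yet at archive rewrite rates its write endurance outlasts any plausible hardware lifetime.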
Flash costs are dropping, some would say faster than disk costs, and deduplication and compression can lower effective flash costs even more. Primary data flash storage systems can use deduplication without compromising data access speed much, if at all, whereas disk arrays cannot deduplicate their primary data in the same way because access time increases.
For this reason primary, high-access data storage is moving from disk to flash. But this advantage does not carry over to archives, where both disk and flash can use deduplication.
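The arithmetic behind that argument can be sketched in a few lines of Python. The prices and the 4:1 reduction ratio are hypothetical, chosen only to make the effect visible; the function name is ours.

```python
# Sketch: effective cost per GB once data reduction is applied.
# Prices and ratios are hypothetical, in arbitrary cost units.
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float = 1.0) -> float:
    """Raw cost divided by the data reduction ratio (4.0 means 4:1)."""
    if reduction_ratio < 1.0:
        raise ValueError("reduction_ratio must be at least 1.0")
    return raw_cost_per_gb / reduction_ratio


FLASH_RAW = 4.0   # assumed raw flash cost per GB
DISK_RAW = 1.0    # assumed raw disk cost per GB
RATIO = 4.0       # assumed deduplication plus compression ratio

# Primary storage: only flash deduplicates without hurting access time.
primary_flash = effective_cost_per_gb(FLASH_RAW, RATIO)
primary_disk = effective_cost_per_gb(DISK_RAW)

# Archive: both media can deduplicate, so the raw price gap returns.
archive_flash = effective_cost_per_gb(FLASH_RAW, RATIO)
archive_disk = effective_cost_per_gb(DISK_RAW, RATIO)

print(f"primary: flash {primary_flash:.2f} vs disk {primary_disk:.2f}")
print(f"archive: flash {archive_flash:.2f} vs disk {archive_disk:.2f}")
```

With these made-up figures deduplication brings flash level with disk for primary data, but in the archive case disk gets the same reduction and flash is four times dearer again.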
Extrapolating the rate of TLC flash cost decreases shows that it will cross over SAS interface disk costs in around 2017 and continue falling faster than disk generally, trending down towards cheaper SATA disk costs by 2020 but not looking likely to match them for another decade or so.
Here is a chart showing this:
NetApp chart of TLC flash and SAS disk price changes over time. The two curves are related to the SATA disk price change curve which is normalised to 1.0 over time.
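For readers who want to try this kind of extrapolation themselves, here is a minimal sketch assuming each cost falls by a constant percentage per year. The starting costs and decline rates are placeholders, not values taken from the NetApp chart, and `crossover_years` is a name we made up.

```python
import math
from typing import Optional


def crossover_years(
    flash_cost: float,
    flash_decline: float,
    disk_cost: float,
    disk_decline: float,
) -> Optional[float]:
    """Years until flash cost falls to disk cost under constant yearly declines.

    Costs are in the same arbitrary unit; declines are fractions between
    0 and 1 (0.30 means 30 per cent cheaper each year).
    Returns None if flash never catches up.
    """
    if flash_cost <= disk_cost:
        return 0.0  # already at or below the disk cost
    if flash_decline <= disk_decline:
        return None  # flash is not falling faster, so the curves never meet
    # Solve flash_cost * (1 - f)**t == disk_cost * (1 - d)**t for t.
    return math.log(disk_cost / flash_cost) / math.log(
        (1 - flash_decline) / (1 - disk_decline)
    )


# Placeholder inputs: flash at 10 units falling 30% a year,
# disk at 3 units falling 5% a year.
years = crossover_years(10.0, 0.30, 3.0, 0.05)
print(f"Curves cross after about {years:.1f} years")
```

The result is only as good as the assumption that past rates of decline continue unchanged, which is the same caveat that applies to any chart of this kind.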
The take-away here is that, GB for GB, TLC flash still costs more than disk, which costs more than tape. Therefore there is little or no prospect of TLC flash replacing tape as an archive medium.
The big picture
Consultants at Wikibon have suggested that flash may still be used in archiving, but only to hold the metadata for large objects, with the bulk of the data still stored on tape. They have christened this twin technology concept FLAPE.
"The combination of tape and flash, for large objects/files, is going to offer not only lower costs, but also much higher performance than spinning disk-based alternatives," they say.
"The key to this approach will be surfacing metadata, currently buried on tape cartridges, to a flash layer that can signal the location of desired data on the tape. Combined with linear tape file system technology, we believe this approach will deliver better business value for the right use cases."
A table of hardware costs Wikibon has prepared shows that, looking at 10-year cumulative hardware costs, a disk-only archive costs $5.5m while tape-only is far lower at $0.8m.
Disk plus flash costs up to $7.3m while FLAPE, flash plus tape, can cost up to $2.6m: advantage FLAPE if you need fast data access with the lowest per-GB storage cost.
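The comparison is easy to reproduce from the four figures quoted above. This short sketch simply ranks them and expresses each as a multiple of tape-only; the dictionary keys are our own labels, and the two "up to" figures are treated as upper bounds.

```python
# Wikibon 10-year cumulative hardware costs quoted above, in $m.
# The disk+flash and FLAPE figures are "up to" values, so upper bounds.
COSTS_M = {
    "tape only": 0.8,
    "FLAPE (flash + tape)": 2.6,
    "disk only": 5.5,
    "disk + flash": 7.3,
}

baseline = COSTS_M["tape only"]

# Print cheapest first, with each option as a multiple of tape-only.
for option, cost in sorted(COSTS_M.items(), key=lambda item: item[1]):
    print(f"{option:<22} ${cost:.1f}m  ({cost / baseline:.1f}x tape only)")

# How much FLAPE saves against the other flash-accelerated option.
saving = COSTS_M["disk + flash"] - COSTS_M["FLAPE (flash + tape)"]
print(f"FLAPE undercuts disk + flash by up to ${saving:.1f}m")
```

On these numbers FLAPE costs roughly three times as much as tape alone but well under half as much as disk plus flash.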
Read the Wikibon FLAPE report here. ®
* Although disk is faster at getting to the start of a file than tape, tape is actually faster than disk when streaming large files, according to Wikibon.