Storage Architect
DataCore has been active over recent months with benchmarks based on its new SANsymphony Parallel Server offering. The most recent of these claims 5.1 million SPC-1 IOPS at $0.08/SPC-1 IOPS and 0.32 millisecond response time.
Other vendors are crying foul over these results, claiming they don't represent a true test because all of the data is held in memory. So, is it fair to put all of your data in DRAM, or is this simply gaming the test?
In this discussion I can see a number of clear issues:
- Putting all of your data in cache isn't cheating. In fact, if cost/benefit analysis can justify it, we should be caching as much data as possible. In-memory databases and products like PernixData FVP and Infinio Accelerator specifically aim to keep as much data as possible in the cache (as DRAM or flash) rather than going out to external storage for it.
- Cache miss is an issue. What we have to look at is what happens to I/O response time for data not in the cache or when the cache becomes fully loaded. If we never reach this point though, who cares if all the data is in memory? This would be a good testing point for the DataCore solution.
- Caching isn't persistent storage. In general, caching I/O isn't the same as serving it off persistent storage. Cache is volatile and needs warmup time as well as additional protection. If data isn't in cache and has to be retrieved from the backing store, then that I/O could suffer. If I/O response time has to be 100 per cent guaranteed, then data should sit on flash.
- With benchmarks, caveat emptor. All benchmarks can be gamed in one way or another. Benchmark workload profiles rarely match real-world applications and there's no replacement for running proofs of concept to validate vendor claims (check out my posts on storage performance).
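The cache-miss point above comes down to simple arithmetic. As a rough sketch in Python, the mean response time is the hit ratio multiplied by the cache latency, plus the miss ratio multiplied by the backing-store latency. The latency figures used here are illustrative assumptions, not measurements from DataCore or from any SPC-1 result.

```python
def mean_response_ms(hit_ratio, cache_ms, backing_ms):
    """Mean I/O response time for a given cache hit ratio.

    hit_ratio  -- fraction of I/Os served from cache (0.0 to 1.0)
    cache_ms   -- response time of a cache hit, in milliseconds
    backing_ms -- response time of a miss served by the backing store
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0.0 and 1.0")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backing_ms


# Assumed, illustrative latencies (not taken from any published result).
CACHE_MS = 0.05   # DRAM cache hit
FLASH_MS = 0.5    # miss served from flash
DISK_MS = 8.0     # miss served from spinning disk

for hit_ratio in (1.0, 0.99, 0.95, 0.80):
    flash = mean_response_ms(hit_ratio, CACHE_MS, FLASH_MS)
    disk = mean_response_ms(hit_ratio, CACHE_MS, DISK_MS)
    print(f"hit ratio {hit_ratio:.2f}: "
          f"flash-backed {flash:.3f} ms, disk-backed {disk:.3f} ms")
```

With these assumed numbers, a disk-backed system that misses on just 5 per cent of I/Os averages roughly nine times the latency of one that never misses. The mean also hides the fact that each individual miss still pays the full backing-store latency, which is why a guaranteed response time points towards flash.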
In an ideal world, all of our data would sit on the fastest media possible. However, compromises have to be made: servers will only hold a certain amount of DRAM; DRAM is volatile; DRAM is (relatively) expensive; we like persistence in our data; and we have mobility requirements for our data.
For all of these reasons, keeping everything in DRAM and nowhere else isn't practical. However, if we can serve the vast majority of I/O requests from cache, then we're in a good place. This is what storage arrays have been doing since EMC introduced the Integrated Cached Disk Array (ICDA), in the form of Symmetrix, in the early 1990s.
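To see why a cache much smaller than the data set can still absorb most I/O, here is a minimal Python sketch of a least-recently-used (LRU) cache replayed against a skewed workload. Everything in it is an assumption made for illustration: the LRU policy, the block counts, and the 90/10 skew (90 per cent of accesses landing on 10 per cent of blocks). It says nothing about how DataCore or any array vendor actually manages its cache.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(accesses, cache_blocks):
    """Replay a list of block numbers through an LRU cache; return the hit ratio."""
    if not accesses:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


def skewed_workload(n_accesses, n_blocks, hot_fraction=0.1, hot_share=0.9, seed=42):
    """Hypothetical workload: hot_share of accesses hit the first hot_fraction of blocks."""
    rng = random.Random(seed)
    hot_blocks = max(1, int(n_blocks * hot_fraction))
    accesses = []
    for _ in range(n_accesses):
        if rng.random() < hot_share:
            accesses.append(rng.randrange(hot_blocks))            # hot block
        else:
            accesses.append(rng.randrange(hot_blocks, n_blocks))  # cold block
    return accesses


accesses = skewed_workload(n_accesses=20_000, n_blocks=1_000)
for cache_blocks in (50, 200, 1_000):
    ratio = lru_hit_ratio(accesses, cache_blocks)
    print(f"cache of {cache_blocks:>5} blocks: hit ratio {ratio:.3f}")
```

Run against your own I/O trace rather than a synthetic one, the same replay gives a first estimate of how much cache a workload needs before most requests stop reaching the backing store.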
The architect's view
Naturally, DataCore is presenting its product in the best light possible. Every vendor bar none does this and will highlight the benefits of their offerings without discussing the shortcomings. Benchmarks, including SPC-1, are far from perfect – for example, systems that have always-on data optimization features aren't supported for testing.
However, it also wouldn't be practical to continually update the benchmark specification. Testing is expensive and vendors can't afford to be running benchmarks regularly, which they'd have to do if the specification were continually changing. Without that retesting, results produced against different versions of the specification couldn't be compared realistically from vendor to vendor.
Just remember, there's no substitute for doing your own testing, preferably with your own workload. Use the benchmarks in the way they were intended – as a guideline rather than a definitive statement of capability. ®