IBM's FlashSystem looks flashy enough, but peek under the hood...
Old tech, new tricks. Hey, if it works, why not?
Storage Architect
This week IBM announced three new flash products, two of which are based on existing technology.
Mainframe/Power customers got the all-flash DS8888, with the A9000 and A9000R models covering the rest of the market.
What’s interesting about these latter two products is that they are based on Spectrum Accelerate, otherwise known as the software behind the XIV platform.
Speeds and feeds
Let’s quickly run through some product specifics. The A9000 is an 8U appliance built from three controllers and a flash enclosure, connected using dual 56Gbit/s InfiniBand switches. The 12 drive slots can be populated with 1.2TB, 2.9TB or 5.7TB MicroLatency modules, providing raw capacity from 14.4TB to 68.4TB, or a claimed 60TB to 300TB with the rather specific 5.26:1 data reduction savings.
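The capacity arithmetic is easy to check. The sketch below uses only the figures quoted above (12 slots, three module sizes, the 5.26:1 ratio); the function names are illustrative, and working the claimed capacity back to a pre-reduction figure is my own back-of-envelope step, not something IBM publishes.

```python
# Back-of-envelope check of the A9000 capacity figures quoted in the article.
# All inputs are the article's numbers; the helper names are illustrative.

SLOTS = 12                         # MicroLatency module slots per enclosure
MODULE_SIZES_TB = (1.2, 2.9, 5.7)  # available module sizes
REDUCTION_RATIO = 5.26             # IBM's claimed data reduction ratio


def raw_capacity_tb(module_tb: float, slots: int = SLOTS) -> float:
    """Raw flash capacity with every slot filled with one module size."""
    return module_tb * slots


def pre_reduction_tb(claimed_tb: float, ratio: float = REDUCTION_RATIO) -> float:
    """Capacity needed before data reduction to reach a claimed figure."""
    return claimed_tb / ratio


for size in MODULE_SIZES_TB:
    print(f"{size}TB modules: {raw_capacity_tb(size):.1f}TB raw")

for claimed in (60, 300):
    print(f"{claimed}TB claimed implies {pre_reduction_tb(claimed):.1f}TB before reduction")
```

On these numbers the claimed 60TB and 300TB work back to roughly 11.4TB and 57TB before reduction, against 14.4TB and 68.4TB raw. The gap is presumably protection and spare overhead, although the announcement figures don't break that down.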
Latency is a claimed 250 microseconds (more on that later), with up to 500,000 IOPS on a traditional 70/30 read/write workload. Multiple A9000 systems can be managed together using new Hyper-Scale Manager software.
The A9000R is a more scalable platform, built from between two and six “grid elements”. A grid element is two controllers and one flash enclosure, with all of the components connected to each other using two 56Gbit/s InfiniBand switches. Unlike some architectures (e.g. XtremIO), all controllers can talk to all flash enclosures, because the flash enclosures are basically FlashSystem 900 units connected to the shared network.
Systems can be built using either 2.9TB or 5.7TB MicroLatency modules, delivering a usable capacity of 300TB to 900TB (using 2.9TB drives) or 600TB to 1.8PB (using 5.7TB drives). Again, these figures assume the claimed 5.26:1 data reduction and a 70/30 read/write ratio, with a claimed latency of 250 microseconds and up to two million IOPS.
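A quick sketch shows how these figures scale. It assumes capacity grows linearly with the number of grid elements (the quoted minimum and maximum are consistent with that) and that the two million IOPS figure applies to a full six-element system; the function names are mine, not IBM's.

```python
# Scaling arithmetic for the A9000R, using only figures quoted in the article.
# Assumption: capacity scales linearly with the number of grid elements.

MIN_ELEMENTS, MAX_ELEMENTS = 2, 6
CONTROLLERS_PER_ELEMENT = 2

# Claimed capacity (TB) of the minimum two-element system, by module size.
MIN_CAPACITY_TB = {"2.9TB": 300, "5.7TB": 600}


def capacity_per_element_tb(module: str) -> float:
    """Claimed capacity contributed by one grid element."""
    return MIN_CAPACITY_TB[module] / MIN_ELEMENTS


def capacity_tb(module: str, elements: int) -> float:
    """Claimed capacity for a system with the given number of grid elements."""
    if not MIN_ELEMENTS <= elements <= MAX_ELEMENTS:
        raise ValueError("an A9000R has between two and six grid elements")
    return capacity_per_element_tb(module) * elements


def iops_per_controller(total_iops: int, controllers: int) -> float:
    """Headline IOPS divided evenly across controllers."""
    return total_iops / controllers


a9000 = iops_per_controller(500_000, 3)
a9000r = iops_per_controller(2_000_000, MAX_ELEMENTS * CONTROLLERS_PER_ELEMENT)

for module in MIN_CAPACITY_TB:
    print(f"{module} modules: {capacity_per_element_tb(module):.0f}TB per grid element, "
          f"{capacity_tb(module, MAX_ELEMENTS):.0f}TB at six elements")
print(f"A9000: {a9000:,.0f} IOPS per controller; A9000R: {a9000r:,.0f} IOPS per controller")
```

Each grid element contributes 150TB or 300TB of claimed capacity, and both systems come out at about 167,000 IOPS per controller, which suggests the headline IOPS number tracks controller count rather than the number of flash enclosures.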
I’m intrigued and horrified in equal measure that IBM have decided to use XIV as the basis of their latest scale-out all-flash solution. On the positive side, I can see the benefits of using the XIV software: it has mature support for VMware environments, and can deliver quality of service, multi-tenancy, BC/DR through replication and better data optimisation than the V9000, with de-duplication and compression.
IBM A9000R rear view
The grid architecture provides the ability to use all of the infrastructure in a massively parallel way, although two million IOPS for a full A9000R seems a little low. However, the over-engineering is enormous. Take a look at the picture of the wiring as seen from the rear of a fully populated system. It's not easily serviceable if a node or shelf failure occurs.
Then there are the environmental factors. Achieving 300TB in 8U isn’t revolutionary: Pure can do 250TB+ in 7U (FlashArray//m) and HP 3PAR 8450 could achieve 370TB in 8U. As for power, the A9000 maximum is 2.91kW, compared to 1.5kW for the Pure system quoted. A fully populated A9000R consumes a maximum of 13.91kW.
However, the point here is that both configurations seem hugely imbalanced towards the controller rather than enclosure hardware. (Note: Figures will vary based on claimed and actual dedupe/compression figures).
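To put the density and power numbers side by side, here is a rough sketch using only the figures quoted above. Pairing the A9000R's 13.91kW with its 1.8PB maximum is my assumption (a full system with 5.7TB modules), Pure's "250TB+" is taken as 250TB, and fields the article doesn't give are left empty rather than guessed.

```python
# Density and power comparison from the figures quoted in the article.
# Capacities are vendor-claimed and rest on different data reduction assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class System:
    name: str
    capacity_tb: float
    rack_units: Optional[int] = None      # None where the article gives no figure
    max_power_kw: Optional[float] = None  # None where the article gives no figure

    def tb_per_u(self) -> Optional[float]:
        """Claimed capacity per rack unit."""
        if self.rack_units is None:
            return None
        return self.capacity_tb / self.rack_units

    def watts_per_tb(self) -> Optional[float]:
        """Maximum power draw per claimed terabyte."""
        if self.max_power_kw is None:
            return None
        return self.max_power_kw * 1000 / self.capacity_tb


SYSTEMS = [
    System("IBM A9000", 300, rack_units=8, max_power_kw=2.91),
    System("Pure FlashArray//m", 250, rack_units=7, max_power_kw=1.5),
    System("HP 3PAR 8450", 370, rack_units=8),
    # Assumption: the 13.91kW maximum corresponds to the 1.8PB configuration.
    System("IBM A9000R (full)", 1800, max_power_kw=13.91),
]


def fmt(value: Optional[float]) -> str:
    return "n/a" if value is None else f"{value:.1f}"


for s in SYSTEMS:
    print(f"{s.name}: {fmt(s.tb_per_u())} TB/U, {fmt(s.watts_per_tb())} W/TB")
```

On the quoted figures the A9000 comes out at about 9.7W per claimed terabyte against 6.0W for the Pure system, with the full A9000R at about 7.7W. Treat these as indicative only, since the claimed capacities depend on each vendor's data reduction assumptions.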
When XIV first came to the market in 2005, the architecture was interesting and new. However, the platform is now well over ten years old; in fact, the original company was founded in 2002. Successive generations of the product have introduced new features and addressed the Achilles heel of XIV, namely the shared backplane/network, which was originally based on 1Gbit Ethernet, replaced by 10Gbit Ethernet in Gen2 and InfiniBand in Gen3.
I/O performance was always limited by the backend network. However, the inherent design and distribution of data mean other issues remain, like the habit of rounding LUNs up to the nearest 17GB in capacity, which you’d think would have been overcome in the software-only implementation.
A word on minimum latency
While we’re at it, let’s touch on the “minimum latency” reference. IBM quotes 250 microseconds as the minimum, without reference to whether this covers reads or writes. A single FlashSystem 900 has latency figures of 90 microseconds (write) and 155 microseconds (read), so you can assume that the A9000 series is adding 50-100 per cent overhead.
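How large the overhead looks depends on which FlashSystem 900 figure the 250 microseconds is measured against. The sketch below simply computes the difference both ways from the numbers quoted above; since IBM doesn't say whether its figure covers reads or writes, neither comparison should be read as definitive.

```python
# Overhead of the quoted A9000 latency relative to a single FlashSystem 900.
# Inputs are the figures quoted in the article, in microseconds.

A9000_QUOTED_US = 250
FS900_US = {"write": 90, "read": 155}


def overhead_pct(observed_us: float, baseline_us: float) -> float:
    """Extra latency as a percentage of the baseline."""
    return (observed_us - baseline_us) / baseline_us * 100


for op, baseline in FS900_US.items():
    added = A9000_QUOTED_US - baseline
    pct = overhead_pct(A9000_QUOTED_US, baseline)
    print(f"vs FlashSystem 900 {op} ({baseline}us): +{added}us, {pct:.0f}% overhead")
```

A direct ratio gives roughly 61 per cent over the read figure and roughly 178 per cent over the write figure, so any single overhead estimate is quite sensitive to the baseline chosen.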
The issue here is more to do with the term “minimum”, indicating that this is the best performance you can expect to receive, rather than any average. I raised this question during my briefing on A9000 and was recommended to look at the SPC testing figures for FlashSystem 900 as a guide.
Testing was performed in October 2015, and the graph on page seven of the results (PDF) shows how response time increases with increasing load. At 10 per cent load, response time was 240 microseconds (average), rising to 490 microseconds at 100 per cent load.
We can therefore assume that A9000 systems may see response times of anywhere from 250 to 500 microseconds, depending on the system load. Perhaps IBM will submit A9000 test results to make this clearer and more transparent.
The Architect’s View
In one respect I can see why IBM has chosen to re-use technology IP they already own and to develop all-flash solutions from existing architectures. There are good compatibility and management benefits – data can be migrated into and out of all-flash systems quite easily.
However, I wonder whether the lack of original product design or enhancement of the TMS acquisition (other than increasing the capacity of MicroLatency modules) points to a desire by IBM to limit their exposure to new hardware design costs, especially in storage. Only 3PAR springs to mind as a company that has taken their existing solution and directly enhanced it with flash. Pretty much everyone else in the market has built products from scratch.
Of course, hardware and technical elegance don’t directly translate into having the best product. Any TCO view has to consider cost, operational aspects, the support model and so on. Taking the holistic view, A9000/R may be a great solution for existing IBM customers who have invested in XIV skills. However, I’m not sure what this announcement says about the long-term viability of IBM storage; currently I think they are hedging their bets. ®