EMC’s DSSD all-flash array hits the streets, boasting 10m IOPS
Rack scale flashery from EMC abolishes network latency
+Comment EMC has launched its all-flash, rack-scale DSSD D5 array* offering 10 million IOPS and 144TB in 5U of rack space.
Other headline numbers for what EMC calls its rack-scale flash array are 100 microsecond latency and 100GB/sec bandwidth.
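As a rough sanity check on those headline figures, Little's Law (mean IOs in flight = throughput x mean latency) gives a feel for how much parallelism the box needs to be fed. The sketch below assumes, purely for illustration, that the peak IOPS, latency and bandwidth numbers all apply at the same time; EMC has not said that they do.

```python
# Headline figures quoted by EMC for the D5.
IOPS = 10_000_000
LATENCY_US = 100
BANDWIDTH_GB_PER_SEC = 100


def ios_in_flight(iops, latency_us):
    """Little's Law: mean concurrency = throughput x mean latency."""
    return iops * latency_us / 1_000_000


def mean_io_size_kb(bandwidth_gb_per_sec, iops):
    """Average IO size if peak bandwidth and peak IOPS coincided (decimal units)."""
    return bandwidth_gb_per_sec * 1_000_000 / iops


# Assumption: all three headline numbers hold simultaneously.
in_flight = ios_in_flight(IOPS, LATENCY_US)
io_size_kb = mean_io_size_kb(BANDWIDTH_GB_PER_SEC, IOPS)

print(f"IOs in flight: {in_flight:.0f}")
print(f"Implied mean IO size: {io_size_kb:.0f} KB")
```

Under that assumption the array would need about a thousand IOs outstanding at any instant, spread across its attached servers, to sustain the quoted rate.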
Consider the D5 product as being suited for “emerging next-generation applications based on extremely large, fast-growing working sets with 100 per cent hot and active data.”
There are up to 36 flash drives (modules) with 144TB of raw capacity – 4TB/module – and 100TB of usable capacity. We understand the modules are custom-designed. EMC expects this flash drive technology to appear in XtremIO, VNX and VMAX all-flash systems in the future.
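The capacity arithmetic can be checked in a few lines. This is a minimal sketch using only the figures quoted in this article; the 44TB gap between raw and usable is presumably consumed by RAID protection and other overheads, but EMC has not given a breakdown.

```python
# Figures quoted in the article.
MODULES = 36
USABLE_TB = 100


def raw_capacity_tb(modules, tb_per_module):
    """Raw capacity of a fully or partly populated chassis."""
    return modules * tb_per_module


raw_tb = raw_capacity_tb(MODULES, 4)      # fully populated with 4TB modules
usable_fraction = USABLE_TB / raw_tb      # share of raw flash that is usable
overhead_tb = raw_tb - USABLE_TB          # not broken down by EMC

print(f"Raw: {raw_tb}TB, usable: {usable_fraction:.1%}, overhead: {overhead_tb}TB")
```

Roughly 69 per cent of the raw flash ends up usable. The same function gives 72TB raw for a chassis fully populated with the 2TB modules.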
EMC says the system has enterprise-class availability and serviceability features:
- Dual-ported client cards
- Dual H/A controllers
- Redundant components
Flash module reliability is helped by technology EMC calls Cubic RAID (some form of multi-dimensional RAID scheme), dynamic wear levelling, flash physics control (no details given) and space-time garbage collection.
The shared storage DSSD D5 supports up to 48 connected servers with redundant NVMe PCIe gen 3 connectivity to each node. There is an NVMe PCIe-based network – mesh – with separate control and data paths. It's called the world's largest PCIe mesh.
This network or fabric can cover connectivity distances found inside a rack and between adjacent racks.
Each flash module is connected to the PCIe mesh via two separate PCIe gen 3 x4 lane connections, providing up to 8GB/sec of bandwidth to each module and parallel access to thousands of flash dice (dies).
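The 8GB/sec per-module figure lines up with standard PCIe gen 3 arithmetic: 8 gigatransfers per second per lane with 128b/130b line encoding. The sketch below works that through, per direction, and ignores packet and protocol overhead, so real payload rates would be somewhat lower. The aggregate figure assumes all 36 modules could be driven flat out at once, which is our assumption and not an EMC claim.

```python
GT_PER_SEC_PER_LANE = 8.0   # PCIe gen 3 signalling rate, gigatransfers/s
ENCODING = 128 / 130        # 128b/130b line encoding efficiency
LANES_PER_LINK = 4          # x4 connection, per the article
LINKS_PER_MODULE = 2        # two separate connections, per the article
MODULES = 36


def link_gbytes_per_sec(lanes):
    """Per-direction link bandwidth in GB/s, before protocol overhead."""
    return GT_PER_SEC_PER_LANE * ENCODING * lanes / 8


per_link = link_gbytes_per_sec(LANES_PER_LINK)
per_module = per_link * LINKS_PER_MODULE
all_modules = per_module * MODULES   # assumption: every module saturated at once

print(f"Per x4 link: {per_link:.2f} GB/s")
print(f"Per module:  {per_module:.2f} GB/s")
print(f"36 modules:  {all_modules:.0f} GB/s")
```

On these numbers the module links add up to well over the 100GB/sec headline bandwidth, which suggests the limit sits elsewhere in the system.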
The D5 has two Control Modules with fully redundant, active-active high-availability. They manage IO as a control plane. Data flows directly between rack server clients and the flash modules through IO Modules – the data plane – with direct memory access.
All major D5 components are redundant and field-replaceable. IO is atomic and protected from loss through power failure.
Either 2TB or 4TB flash modules are available.
EMC makes no mention of deduplication or compression, the D5 being a machine optimised for sheer horsepower grunt. According to Jeremy Burton, President, Products and Marketing at EMC II, the DSSD systems are unlikely to get compression and deduplication, as putting these data efficiency facilities in the data path would detract from the D5's performance.
Positioning
The D5 is for applications built on top of Hadoop, high performance databases and data warehouses as well as custom applications used for complex, real-time data processing and real-time analytics and insight.
EMC says customers can consolidate multiple apps onto the D5 platform, and “simplify data warehouses by eliminating multiple copies of data, complex indexing, intricate partitioning, and the need for materialised views.”
EMC said its VCE Converged Infrastructure platforms will incorporate the DSSD D5 to expand its flash offerings. We might also wonder about deduplication and compression being on DSSD's roadmap.
Performance
Compared to other all-flash arrays, EMC says the D5 has up to 68 per cent lower TCO, five times lower latency and ten times higher IOPS and bandwidth than today’s fastest flash platforms. It also claims an order of magnitude improvement in Hadoop HBase workload performance compared to traditional Hadoop deployed on Direct Attached Storage (DAS).
We're told that Big Data analytics customers can independently scale compute and storage, and write only one copy of data on the D5's flash, irrespective of the HDFS replication factor.
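To put the single-copy point in numbers: stock HDFS defaults to a replication factor of 3 (the `dfs.replication` setting), so a conventional DAS cluster stores every block three times. The sketch below compares that with one copy on the D5. The 30TB dataset is a hypothetical figure chosen for illustration, and the D5's own internal protection overhead is taken to be already reflected in its usable capacity.

```python
HDFS_DEFAULT_REPLICATION = 3   # default value of dfs.replication in stock HDFS


def storage_needed_tb(dataset_tb, copies):
    """Capacity consumed when every block is stored `copies` times."""
    return dataset_tb * copies


dataset_tb = 30  # hypothetical dataset size, not from EMC

das_tb = storage_needed_tb(dataset_tb, HDFS_DEFAULT_REPLICATION)
d5_tb = storage_needed_tb(dataset_tb, 1)   # one copy, per EMC's claim
saving = 1 - d5_tb / das_tb

print(f"DAS cluster: {das_tb}TB, D5: {d5_tb}TB, saving: {saving:.0%}")
```

With the default replication factor the single-copy approach needs a third of the capacity, whatever the dataset size.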
EMC says applications such as genetic sequencing calculations, fraud detection, credit card authorisation, and advanced analytics can have their speed increased as much as ten times.
“DSSD D5 accelerates current databases and data warehouse solutions, such as Oracle, through an innovative low-latency data path to deliver 3X lower latency, one-fifth the rack space and 68 per cent lower TCO than the highest published performance Oracle solution.”
Brian Dougherty, Chief Technical Architect, CMA, said: “DSSD D5 has fundamentally changed our business by eliminating unnecessary software, hardware and pre-processed batch jobs. With DSSD, we’re able to support applications and analytics at never-before-seen speeds.”
There is a jointly developed HDFS plug-in from Cloudera and EMC to accelerate Big Data applications and tools. Mike Olson, Chief Strategy Officer at Cloudera, said: “Cloudera has tested the DSSD D5 appliance in our lab, and we've seen an order of magnitude increase in performance. It’s the fastest HBase cluster we’ve ever tested.”
The 4TB flash modules could become 8TB ones later this year and even 32TB in 2017 as EMC rides the NAND development curve.
El Reg says
There is one other shared all-flash array connected by NVMe PCIe and that is the Mangstor box.
Stealthy startup E8 says it's developing a 10 million IOPS flash array.
These and the DSSD D5 represent a new class of fundamentally faster external storage. It is not only that they are all-flash: their PCIe-class access speed means 8 and 16Gbit/s Fibre Channel-connected arrays and 10GbitE iSCSI-linked arrays cannot match them.
No other supplier has a storage array equivalent in performance to the D5; it stands alone.
Alone amongst the existing all-flash array suppliers apart from Mangstor, that is, while Pure Storage is making positive statements about NVMe over fabric connectivity. It is our understanding that all all-flash arrays, without exception, will have to prepare to support connectivity at this speed.
We think HCIA vendors may start considering NVMe fabric technology to link their cluster nodes as well.
EMC DSSD D5 is generally available in March 2016. ®
* EMC says the D5 is direct-attached storage (DAS) and not an array, as in networked storage array.