Analysis You wait for a bus for ages, and then two come along at once. Two data-transfer buses. Something like that.
Both OpenIO and Igneous have launched plug-on ARM server cards for storage drives: these single-board computers each snap onto a hard drive to form nano-servers that are organized into a grid of object storage nodes.
Igneous’s CEO Kiran Bhageshpur cryptically declares that “we have only half-announced the Igneous Data Service,” and leaves us humble hacks wondering what he is on about.
El Reg covered Igneous when it launched its service in October. There were two parts to Igneous’ concept: a subscription service to a managed on-premises storage array presented like a public cloud S3 API-accessed storage service; and the actual technology, with two 1U stateless x86 dataRouters (aka controllers) fronting a 4U dataBox containing 60 nano-servers (ARM server + 1Gbit/s Ethernet port + 3.5-inch disk drive).
Igneous was founded in 2013, with lots of Isilon experience in its seven founders’ backgrounds. It gained $3.1 million in seed funding that year, and received $23.6 million in an A-round in 2014. Two years later, it has announced initial hardware and software product availability.
At the Silicon Valley briefing, we had a chance to find out more about this object storage box. We’ll concentrate on the technology here and why it was developed the way it has been, rather than on the cloud-focussed subscription service.
Igneous CEO and cofounder Kiran Bhageshpur
The dataBox is a JBOND (just a bunch of networked drives), and Igneous says: “Each nano-server has its own ARM-based processor, memory, disk controller, redundant Ethernet controllers, and boot flash. In essence, every drive itself is a server that talks IPv6 over Ethernet. Nano-servers can simply fail in place.”
Each strap-on board has a 32-bit ARMv7 1GHz dual-ARM-Cortex-A9-core Marvell Armada 370 system-on-chip that includes two 1Gbps Ethernet ports and can talk SATA to the direct-attached 3.5-inch disk drive. That processor gives the card enough compute power to run Linux right up against the storage.
The JBOND has 212TB usable from 60 x 6TB drives organized in 2 x (20 + 8) erasure coded drive sets. The dataRouter, an original design manufacture from Intel, is responsible for the management and remote control of an Igneous deployment.
Two dataRouters are preferable for high availability, and dataRouter and dataBox scale independently. Igneous suggests that, with 4 dataBoxes (848TB), 3 x dataRouters are generally deployed and recommended, still offering an N+1 redundancy.
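The capacity figures above can be sanity-checked with a little arithmetic. The sketch below is ours, not Igneous's: it uses only the drive count, drive size and 2 x (20 + 8) layout quoted in this article, and the function names are illustrative. It does not explain the gap between the 240TB of data-drive capacity and the quoted 212TB usable, nor what the four drives outside the two erasure-coded sets are used for, because Igneous did not break those out.

```python
# Figures quoted in the article; everything else is derived arithmetic.
DRIVES_PER_BOX = 60
DRIVE_TB = 6
SETS_PER_BOX = 2
DATA_DRIVES_PER_SET = 20
PARITY_DRIVES_PER_SET = 8
QUOTED_USABLE_TB = 212  # vendor figure, not derived here


def box_capacity():
    """Return the basic capacity numbers for one dataBox."""
    set_width = DATA_DRIVES_PER_SET + PARITY_DRIVES_PER_SET
    drives_in_sets = SETS_PER_BOX * set_width
    return {
        "raw_tb": DRIVES_PER_BOX * DRIVE_TB,
        "drives_in_sets": drives_in_sets,
        "drives_outside_sets": DRIVES_PER_BOX - drives_in_sets,
        "data_tb": SETS_PER_BOX * DATA_DRIVES_PER_SET * DRIVE_TB,
        # Fraction of each set's capacity that holds data rather than parity.
        "efficiency": DATA_DRIVES_PER_SET / set_width,
        # A k + m erasure-coded set can lose up to m members and still be read.
        "failures_tolerated_per_set": PARITY_DRIVES_PER_SET,
    }


def cluster_usable_tb(boxes):
    """Usable capacity for a number of dataBoxes, using the quoted figure."""
    return boxes * QUOTED_USABLE_TB


if __name__ == "__main__":
    for name, value in box_capacity().items():
        print(f"{name}: {value}")
    print("4 dataBoxes usable TB:", cluster_usable_tb(4))
```

Run as a script, it prints the per-box numbers and the four-box total, which matches the 848TB figure Igneous quotes.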
Bhageshpur talked about data-centric computing with IoT, sensor, and media data growing in scale. He said on-premises data storage was needed because certain kinds of data had to be stored securely, meaning behind a customer’s firewall. The use of nano-servers restricts the failure domain size to the individual disk drive, with drives able to fail in place, and he asserted that giving each drive its own ARM server and Ethernet port meant there were no bottlenecks in the system.
He discussed Seagate’s Kinetic drives, saying Seagate did not let you run code on them, and that these drives have never been put into production. A startup called Coraid had Ethernet-addressed storage, using the ATA-over-Ethernet (AoE) protocol, but it was too early and failed.
The software stack running in the Igneous hardware provides patented Reed-Solomon-based erasure coding. Nano-servers’ interactions are coordinated by software running in the dataRouter and in the Igneous cloud. The dataBox implements a fully-distributed, key:value store, and performance can be up to 650MB/s for large sequential GETs and PUTs with concurrent streams. The goal is to be as performant as S3 from an overall capacity point of view.
The system is extensible to search and other cloud-native (micro)services, with Bhageshpur mentioning data inspection, indexing, transforming and reducing as potential services. For example, data could be searched for based on metadata created by applications instead of only the storage attributes created by a filer. Check out this video for more insight into this.
We're reminded of Coho Data and its technology enabling storage-related services to run directly in its array. That product does not, though, use Ethernet-addressed drives.
CTO Jeff Hughes talks about classifying images in machine learning in the video. For example, when image data is written (on PUT) it could also be classified. Hughes says it’s like bringing cloud functions/services like Amazon’s Lambda directly into a customer’s data center.
He also says Igneous has made it possible to run containerized data services directly inline with data operations without impacting latency. Target applications would involve datasets generated by machines, with a need for low latency and high throughput. He said initial use-cases could be auto-tagging, metadata extraction, image processing, and text analysis.
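To make the idea concrete, here is a toy sketch of what "classify on PUT, then search by application metadata" could look like. It is purely illustrative: Igneous has not published an API for these services, so the `ToyObjectStore` class, its hook mechanism and the `sniff_type` function are all hypothetical names of ours, and an in-memory dictionary stands in for the real storage. The "classifier" only inspects file signatures, where a real service might run an image-recognition model.

```python
from typing import Callable, Dict, List

# A hook receives the key and object body and returns metadata tags.
PutHook = Callable[[str, bytes], Dict[str, str]]


class ToyObjectStore:
    """In-memory stand-in for an object store. NOT the Igneous API."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._tags: Dict[str, Dict[str, str]] = {}
        self._hooks: List[PutHook] = []

    def register_put_hook(self, hook: PutHook) -> None:
        """Add a function that runs inline on every PUT."""
        self._hooks.append(hook)

    def put(self, key: str, body: bytes) -> None:
        """Store the object, then let each hook attach metadata to it."""
        self._data[key] = body
        tags: Dict[str, str] = {}
        for hook in self._hooks:
            tags.update(hook(key, body))
        self._tags[key] = tags

    def get(self, key: str) -> bytes:
        return self._data[key]

    def search(self, **wanted: str) -> List[str]:
        """Return keys whose hook-generated tags match every requested value."""
        return sorted(
            key
            for key, tags in self._tags.items()
            if all(tags.get(name) == value for name, value in wanted.items())
        )


def sniff_type(key: str, body: bytes) -> Dict[str, str]:
    """Stand-in classifier: tag objects by well-known file signatures."""
    if body.startswith(b"\x89PNG\r\n\x1a\n"):
        return {"kind": "image/png"}
    if body.startswith(b"\xff\xd8\xff"):
        return {"kind": "image/jpeg"}
    return {"kind": "unknown"}


if __name__ == "__main__":
    store = ToyObjectStore()
    store.register_put_hook(sniff_type)
    store.put("cat.png", b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    store.put("notes.txt", b"hello")
    print(store.search(kind="image/png"))
```

The point of the sketch is the shape of the flow, with tagging done at write time so that later searches use application-level metadata, rather than any claim about how Igneous implements it.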
Bhageshpur told us: “We will offer services leveraging our architecture in the future,” and that the architecture allows them to add more compute and SSDs. He said Igneous will build services, as will partners and customers, but they will not write code that runs directly on the nano-servers. The Igneous system is a black box that runs S3 services.
We asked about HDFS support. The answer was “not today.” When pressed, he said: “At present, the Igneous appliance supports object storage via the use of Amazon’s S3 API. It also supports FTP for data migration. We will support other protocols as we go forward. Azure store maybe.”
We imagine partners could add protocol converters to the dataRouters, which are x86 Linux systems.
Bhageshpur’s backing VCs and customers get much more information about these promised Igneous data services, but only under a non-disclosure agreement, which we weren’t offered, of course.
That left us frustrated, and scrabbling to try and imagine what the heck was sexy enough about these coming services to unlock VC and customer wallets. We can’t get that excited about the system because we simply don’t know enough about it.
Getting back to the mundane, Igneous has white papers covering Igneous system use by Arq and Commvault for backup, CloudBerry and Cyberduck, infinite IO, s3cmd and Storage Made Easy.
Igneous’ pricing for the minimum 212TB usable configuration is less than $40,000/year, with capacity sold in 212TiB increments, which equates to less than $200/TB/year or around 1.5 cents per GB per month. Put another way, $1/year gets you 5.3GB of usable capacity, or 1 usable TB costs $188 per year. ®
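For anyone wanting to check the sums, the pricing arithmetic can be reproduced as follows. This is a back-of-the-envelope sketch that assumes the ceiling price of $40,000/year, 212TB of usable capacity and decimal units (1TB = 1,000GB); the actual price is "less than" that figure, so these are upper bounds. The function name is ours.

```python
ANNUAL_PRICE_USD = 40_000  # quoted ceiling: "less than $40,000/year"
USABLE_TB = 212            # minimum usable configuration


def price_breakdown(annual_usd=ANNUAL_PRICE_USD, usable_tb=USABLE_TB):
    """Return ($/TB/year, cents/GB/month, GB per $1/year), decimal units."""
    usd_per_tb_year = annual_usd / usable_tb
    cents_per_gb_month = usd_per_tb_year / 1000 / 12 * 100
    gb_per_dollar_year = usable_tb * 1000 / annual_usd
    return usd_per_tb_year, cents_per_gb_month, gb_per_dollar_year


if __name__ == "__main__":
    per_tb, cents, gb = price_breakdown()
    print(f"${per_tb:.2f}/TB/year, {cents:.2f} cents/GB/month, "
          f"{gb:.1f}GB per $1/year")
```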