Facebook puts some brains in Open Vault JBOD storage
ARM or Atom, pick your embedded CPU and interconnect poison
Open Compute 2013 At last week's Open Compute Summit 2013, the people behind the open source hardware project were showing off some enhancements for the Open Vault JBOD storage array that Facebook has cooked up for its own use in its two newest data centers and that will presumably be added to its existing data center.
The Open Vault array, known by its code-name "Knox," has been contributed by the social network to the Open Compute Project open source hardware design effort. Among other things, Open Vault is used for cold storage of the 240 billion photos on the site, which is growing at a rate of 350 million per day, or 7PB per month. (You can see the Open Vault specs here.)
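For a rough sense of scale, the two growth figures quoted above can be divided to get an implied average footprint per uploaded photo. This is only a back-of-the-envelope sketch: it assumes a 30-day month and decimal petabytes, neither of which is stated, and the 7PB figure may well include multiple stored sizes or replicas of each photo.

```python
# Back-of-the-envelope check of the quoted growth figures.
PHOTOS_PER_DAY = 350_000_000      # from the article
BYTES_PER_MONTH = 7 * 10**15      # 7PB, assuming decimal petabytes
DAYS_PER_MONTH = 30               # assumption, not stated in the article


def average_photo_bytes(photos_per_day, bytes_per_month, days_per_month):
    """Implied storage added per uploaded photo, in bytes."""
    photos_per_month = photos_per_day * days_per_month
    return bytes_per_month / photos_per_month


avg_bytes = average_photo_bytes(PHOTOS_PER_DAY, BYTES_PER_MONTH, DAYS_PER_MONTH)
print(f"Implied storage per photo: about {avg_bytes / 1000:.0f} KB")
```

Under those assumptions the figures work out to roughly two-thirds of a megabyte of new storage per uploaded photo.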
Open Vault is a JBOD array, which means it is just a bunch of disks and is intended to hang off a SAS controller inside of a server. In Facebook's case, that is a custom Open Compute V2 server using Intel's custom "Windmill" two-socket Xeon E5 server node.
The Open Vault array has two 1U disk trays, each of which holds fifteen 3.5-inch SAS drives and two SAS expander boards. The four SAS expander boards feed back to the server and make all 30 drives in the Open Vault look like they are connected directly to the server. Open Vault is designed so any disk or any one SAS expander can be changed without having to take the JBOD offline.
While Open Vault is great for what it does, it lacks brains. And so, through an extension of the Open Vault called "Knockout," ARM server chip upstart Calxeda and Intel are both working on variants that put some brains and internetworking into each JBOD to turn it into a smarter storage cluster.
Frank Frankovsky, vice president of hardware design and supply chain at Facebook and also chairman of the Open Compute effort, showed off two compute boards that slide into the Open Vault array where the SAS expanders currently fit and give it a bit of brains, like the Scarecrow in the Wizard of Oz.
The first one that Frankovsky showed off – and one that is close to being in production – is an ARM-based compute add-on card that is based on the 32-bit ECX-1000 processor from Calxeda:
A Calxeda ARM server node for the Open Vault JBOD
The idea, Gina Longoria, product marketing manager at Calxeda, explains to El Reg, is to allow companies deploying Open Vault storage to run Lustre or Gluster clustered file system code inside each Open Vault tray and maybe only use an x86 node in the rack to run a head node.
The additional computing power could also be used to run other storage software, such as the Ceph distributed object store that is closely affiliated with OpenStack, or even the Cassandra NoSQL data store that was created by Facebook when it ran up against the limits of MySQL relational databases.
The precise software that can be run on an intelligent storage server is not the point. Giving Open Vault some cheap yet powerful brains is the point.
Intel wants a piece of this action, too, and is a good buddy of the Open Compute Project as well, and thus Frankovsky was careful to hold up a similar brain transplant card based on Intel's forthcoming "Avoton" Atom S Series processor, which is expected to have on-chip Ethernet links:
An Intel "Avoton" Atom server node for Open Vault storage
The feeds and speeds of the Intel board were not divulged, but Calxeda was happy to talk about different configurations of its Knockout compute and networking cards for the Open Vault JBOD.
The Calxeda board has a single ECX-1000 processor with four Cortex-A9 cores running at 1.4GHz and 4GB of DDR3 main memory running at 1.33GHz. The board can have two RJ45 ports running at gigabit speeds and five SATA port multipliers, supporting all drives in a single Open Vault tray - you put one in each tray.
The card can be equipped to run software RAID or to run iSCSI target software, mapping from an x86 head node at the top of the rack. You can also have SFP+ or QSFP ports put on this card if you want to spend a little more money.
The current "Knox" Open Vault and its computationally enhanced "Knockout" derivative
Or, if you want to go with the cheaper and better option, you could use CX4 connectors and use the on-chip distributed Layer 2 network on the ECX-1000 chips to be real clever. First, you could put a 24-port Gigabit Ethernet switch between the Windmill head node and the computationally enhanced Open Vault JBODs.
This switch would link the JBODs to other Windmill head nodes for redundancy, eliminating a single point of failure in the rack. Then you could add data compression, hashing, or other algorithms on the local ARM nodes, or Atom-based nodes, too.
By tucking the ECX-1000 server nodes into the Open Vault JBODs, however, you can do one other thing: cross-couple the arrays and their compute nodes across racks. Here's one example:
How you might use the interconnect on ECX-1000s to do a 2D torus between storage JBODs
With the variant of the Knockout server board that has four 10GE ports coming off the ECX-1000 chip, you can turn on the integrated Fleet Services fabric. The top-of-rack switch then handles north-south traffic out of the array and to the network, feeding applications, while the Fleet Services interconnect provides data replication and other services on an east-west network that spans multiple racks.
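To make the 2D torus idea a little more concrete, here is a minimal sketch of how neighbors are found in such a topology. It is purely illustrative: the grid dimensions, the (row, column) node numbering, and the notion that each of the four 10GE ports maps to one neighbor are assumptions made for this example, not details supplied by Calxeda.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the four wrap-around neighbors of node (row, col)
    in a rows x cols 2D torus, in the order up, down, left, right."""
    return [
        ((row - 1) % rows, col),  # up, wrapping from the top edge to the bottom
        ((row + 1) % rows, col),  # down, wrapping from the bottom edge to the top
        (row, (col - 1) % cols),  # left, wrapping from the first column to the last
        (row, (col + 1) % cols),  # right, wrapping from the last column to the first
    ]


# Hypothetical layout: 3 rows of storage nodes across 4 racks.
corner = torus_neighbors(0, 0, rows=3, cols=4)
print(corner)
```

Because the links wrap around at the edges, every node has exactly four neighbors, including those in the corners, so replication traffic can hop between JBODs without passing through the top-of-rack switch.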
All of this can happen under the covers of the Open Vault and behind the scenes where the head node in a storage cluster is blissfully unaware. This would also mean that the x86 node could potentially be quite a bit less powerful in terms of memory and CPU capacity, and in fact, you could have an array of ARM servers acting as the head node if you wanted, according to Longoria.
This is, of course, the option that Calxeda is excited about. ®