Hewlett-Packard might have been wrestling with a lot of issues as CEOs and their strategies come and go, but the company's server gurus know a potentially explosive business opportunity when they see it.
That is why HP has put together a new hyperscale business unit inside of its Enterprise Server, Storage, and Networking behemoth and is getting out the door first with a server cluster that makes use of the EnergyCore ARM RISC server chip just announced by upstart Calxeda.
The hyperscale server effort is known as Project Moonshot, and the first server platform to be created under the project is known as Redstone, after the surface-to-surface missile created for the US Army, which was used to launch America's first satellite in 1958 and Alan Shepard, the country's first astronaut, in 1961.
Moonshot, of course, refers to NASA's Apollo project to put a man on the moon by the end of the 1960s, and there will no doubt be a lot of people calling this HP's Project Moneyshot, referring to something just a bit different. But HP's goal with Project Moonshot is larger than just making some money peddling alternatives to x86 chips for cranky Web operators who want to do more work for less money and with less juice. So don't get the wrong idea that Moonshot is just about ARM.
"This is an extension of, not a replacement of, our ProLiant and Integrity server lines," says Paul Santeler, general manager of the hyperscale business unit within the Industry Standard Servers and Software division at HP.
And while HP was not in a mood to talk specifics, Santeler said that Moonshot would include super-dense servers based on Intel low-power Xeon and Atom chips and Advanced Micro Devices low-power x86 processors as well as multiple suppliers of ARM-based server chips.
HP is not interested in locking itself into one ARM supplier any more than it can afford to depend on just Intel or AMD alone. In this world, with technical, economic, and natural disasters, you need to at least dual-source key components, and, equally importantly, different ARM server chip variants are going to be good at different things.
As we have reported elsewhere, the EnergyCore chips are one of a handful of ARM processors that have been expressly designed for hyperscale server workloads. The ECX-1000 processors that are the first in the EnergyCore products are 32-bit chips that will come in two-core and four-core variants.
They include memory, I/O, and storage controllers and an embedded Layer 2 switch fabric on the chip, which means you can just wire a 4GB DDR3 memory stick to one, slap on some I/O ports, plug these babies into a passive backplane, and you have interconnected server nodes that take the place of rack servers and top-of-rack switches.
The EnergyCore Fabric Switch embedded on each chip can implement 2D torus, mesh, fat tree, and butterfly tree topologies and scale across 4,096 sockets (each socket is a server node, since Calxeda is not doing cache coherency across sockets).
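For readers who have not met a 2D torus before, here is a minimal Python sketch of how neighbors wrap around in one. It is an illustration only: the 64 x 64 grid, the row-major node numbering, and the `torus_neighbors` helper are assumptions made for the example, not Calxeda's actual addressing scheme or link count, neither of which the company has detailed here.

```python
def torus_neighbors(node: int, width: int, height: int) -> list[int]:
    """Return the four wraparound neighbors of `node` in a width x height 2D torus.

    Nodes are numbered row by row (an assumption for this sketch).
    """
    if not 0 <= node < width * height:
        raise ValueError("node id out of range")
    x, y = node % width, node // width
    return [
        ((x - 1) % width) + y * width,    # west, wraps to the far column
        ((x + 1) % width) + y * width,    # east
        x + ((y - 1) % height) * width,   # north, wraps to the last row
        x + ((y + 1) % height) * width,   # south
    ]


# Hypothetical layout: 64 x 64 = 4,096 nodes, the fabric's stated socket limit.
WIDTH = HEIGHT = 64
corner_neighbors = torus_neighbors(0, WIDTH, HEIGHT)
print(corner_neighbors)
```

The point of the wraparound is that a node on the edge of the grid has just as many direct neighbors as one in the middle, which keeps the worst-case hop count down without any central switch.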
The original Calxeda reference design from last year was a 2U rack-mounted chassis that crammed 120 processors (and hence server nodes) into that metal box. With the production-grade ECX-1000 processors, Calxeda has put together a four-node server card with four memory slots and PCI-link interconnects so it can snuggle into a passive backplane to get power and talk to its ARM peers and make a network.
To make the Redstone, HP took a half-width, single-height ProLiant tray server and ripped out just about everything but the tray. In goes the passive backplane that the Calxeda EnergyCard plugs into, and HP can cram three rows of these ARM boards, with six per row, for a total of 72 server nodes, in a half-width 2U slot, like this:
An HP Redstone server tray crammed with Calxeda ARM servers
The trays slide into the 4U version of the ProLiant SL6500 chassis, and you can put four of these trays in the chassis thus:
HP Redstone SL6500 chassis fully armed and dangerous to 32-bit parallel workloads
That gives you 288 server nodes in a 4U rack space, or 72 servers per rack unit. That's 20 per cent more server density than the alpha test machine from Calxeda could do earlier this year with very early samples of its ARM chips.
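If you want to check the density math, it can be sketched in a few lines of Python. The figures come straight from the counts above (four nodes per card, three rows of six cards per tray, four trays per 4U chassis, and 120 nodes in Calxeda's 2U reference box); the variable names are just for illustration.

```python
# Redstone packaging, as described above.
NODES_PER_CARD = 4
CARDS_PER_TRAY = 3 * 6        # three rows of six cards
TRAYS_PER_CHASSIS = 4
CHASSIS_HEIGHT_U = 4

nodes_per_tray = NODES_PER_CARD * CARDS_PER_TRAY
nodes_per_chassis = nodes_per_tray * TRAYS_PER_CHASSIS
redstone_density = nodes_per_chassis / CHASSIS_HEIGHT_U   # nodes per rack unit

# Calxeda's earlier reference design: 120 nodes in a 2U chassis.
alpha_density = 120 / 2

density_gain_pct = (redstone_density / alpha_density - 1) * 100

print(f"{nodes_per_tray} nodes per tray, {nodes_per_chassis} per chassis")
print(f"{redstone_density:.0f} vs {alpha_density:.0f} nodes per U "
      f"({density_gain_pct:.0f} per cent denser)")
```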
That SL6500 chassis in the Redstone system has three pooled power supplies that can back each other up and keep the nodes going in the event one of them goes the way of all flesh. The system has eight cooling fans. Each tray has four 10Gb/sec links that come off the internal EnergyCore Fabric Switch.
All of these ports can be cross-connected using 10Gb/sec XAUI cables, and scaled across as many as 4,096 sockets. (By the way, 4,000 servers is pretty much the upper scalability limit of a Hadoop cluster these days.)
However, the recommended configuration at first will be to link the 72 four-node server cards in a single SL6500 to each other with the integrated fabric switch, and then glue multiple SL6500s to each other using a pair of 10GE top-of-rack switches.
In effect, the SL6500 is the new rack, with an integrated top-of-rack switch, and the two external 10GE switches are akin to an end-of-row switch that usually links multiple racks to each other.
That 288 server count on the Redstone system assumes that you are going to network out to external disk arrays, but you can sacrifice some servers in the trays and plug up to 192 solid state disks or 96 2.5-inch disk drives into an enclosure. The SSD and disk drive cartridges plug into SATA ports on the EnergyCard and draw their power from the backplane in the tray.
The sales pitch for the Redstone systems, says Santeler, is that a half rack of Redstone machines and their external switches implementing 1,600 server nodes has 41 cables, burns 9.9 kilowatts, and costs $1.2m.
A more traditional x86-based cluster doing the same amount of work would only require 400 two-socket Xeon servers, but it would take up 10 racks of space, have 1,600 cables, burn 91 kilowatts, and cost $3.3m. The big, big caveat is, of course, that you need a workload that can scale well on a modestly clocked (1.1GHz or 1.4GHz), four-core server chip that only thinks in 32 bits and only has 4GB of memory.
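To put HP's pitch in ratio form, here is a rough Python sketch using only the figures quoted above. The `Cluster` class and its field names are made up for the example, and the per-node numbers for Redstone include the external switches, since HP's totals do. These are vendor sales figures, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One side of HP's comparison (hypothetical container for this sketch)."""
    name: str
    servers: int
    cables: int
    kilowatts: float
    cost_usd: float

    @property
    def watts_per_server(self) -> float:
        return self.kilowatts * 1000 / self.servers

    @property
    def cost_per_server(self) -> float:
        return self.cost_usd / self.servers


redstone = Cluster("Redstone (ARM)", servers=1600, cables=41,
                   kilowatts=9.9, cost_usd=1.2e6)
xeon = Cluster("Two-socket Xeon", servers=400, cables=1600,
               kilowatts=91.0, cost_usd=3.3e6)

# How many ARM nodes HP reckons it takes to match one Xeon server.
nodes_per_xeon_server = redstone.servers / xeon.servers

power_ratio = xeon.kilowatts / redstone.kilowatts
cost_ratio = xeon.cost_usd / redstone.cost_usd
cable_ratio = xeon.cables / redstone.cables

for c in (redstone, xeon):
    print(f"{c.name}: {c.watts_per_server:.1f} W and "
          f"${c.cost_per_server:,.0f} per server")
print(f"Xeon cluster: {power_ratio:.1f}x the power, {cost_ratio:.2f}x the cost, "
      f"{cable_ratio:.0f}x the cables")
```

By HP's own numbers, then, it takes four ARM nodes to stand in for one two-socket Xeon box, and the claimed savings hold only if the workload really does spread across that many small nodes.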
"There are a lot of customers that I have talked to who think 32-bit is just fine," says Santeler. The chips will probably be good at web serving, web caching, and big data chewing workloads where processing data in smaller bits is the norm, not the exception.
That said, HP doesn't seem to be in a big hurry to commercialize the Redstone machines, but it is getting machines out there as Calxeda is starting to do samples – and that is about as good as it can be.
The ECX-1000 chips from Calxeda are expected to sample late this year, with volume shipments in the middle of next year. HP's Redstone machines using these chips will be available in limited quantities for a limited number of customers in the first half of 2012.
And here's the kicker: Santeler is saying that HP is making no commitments at this time about when it will ship as a generally available product, or even if it will. (I think HP is just being overcautious and dramatic.)
HP is putting Redstone machines into DiscoveryLabs around the world, starting in a data center in Houston, Texas, where its PC and server factory is, and will give potential customers a chance to upload their programs onto the Redstone servers and put them through the 32-bit ARM paces. And for now, that means doing so on Canonical Ubuntu or Red Hat Fedora Linux, which support ARM chips and which have been tweaked to support the Calxeda chips.
Incidentally, HP has no plans at this time for putting ARM processors in its general purpose ProLiant DL rack, ProLiant BL blade, or ProLiant ML tower servers. ®