Mystery startup uncloaks 512-core server
Atom bomb, network included
The mystery behind secretive server startup SeaMicro is dispelled today as the venture-backed maker of what it has been calling "data center appliances" unveils its first product: the SM10000, a server cluster comprising 512 of Intel's Atom processors with a built-in, virtualized network fabric for the servers.
The SM10000 passes the TPM Server Test of having an elegant design: mainly, I want one, and I am not even sure why. I'll figure out what to do with it later. Probably something stupid, like turning it into a giant MapReduce box that uses log tables instead of floating point math units to do calculations, just to see if that would work. In recent years, the server designs from Fabric7, Liquid Computing, and 3Leaf Systems have all passed this test, as did the original "SledgeHammer" Opteron chips from Advanced Micro Devices and their clones, the "Nehalem" processors from Intel from last year.
Ditto the Nvidia Tesla 20 GPU co-processors, the Power7 IH supercomputing nodes used in the "Blue Waters" super, and some of Sun Microsystems' very elegant Sun Fire designs from a few years back; so, too, for many Mini-ITX, Nano-ITX, and Pico-ITX system boards for homemade, low-power servers. Clearly, passing the TPM Server Test doesn't necessarily lead to riches, so it is of dubious value.
SeaMicro has obtained $25m in venture funding from Khosla Ventures, Draper Fisher Jurvetson, Crosslink Capital, and an unnamed private backer. The company was also, you will recall, one of the vendors that received a slice of a $47m grant in January by the US Department of Energy to come up with some greener technologies for the data center.
SeaMicro got the second-biggest slice of the DOE money, which was part of the $787bn Obama administration stimulus package, landing a $9.3m grant to field test a machine that puts hundreds of low-powered servers into a single box. In its proposal, SeaMicro said it could cut power consumption by 75 percent compared to x64 alternatives. The rumor last fall was that SeaMicro was working on a server that would cram as many as 80 processors, perhaps Intel Atoms, perhaps ARM RISC chips, into a single chassis with a direct mesh fabric. The mesh is correct, but the processor count is way low.
The SM10000 does not have 10000 cores, as the name might seem to suggest, but does put 512 individual servers based on the single-core Atom Z530 processor into a 10U chassis, which is a neat trick. And one that the techies who used to work at AMD, Cisco Systems, Force10 Networks, Juniper Networks, and Sun were able to pull off.
SeaMicro was founded in July 2007 by Andrew Feldman, who formerly headed up marketing at Force10, and Gary Lauterbach, an AMD chip designer who was also responsible for putting together Sun's UltraSparc-III and UltraSparc-IV processors. Feldman and Lauterbach looked at the modern, hyperscale workloads that were starting to take over the data centers of the world and came to the conclusion that the complex x64, RISC, and Itanium processors - well suited to deal with predictable workloads solving complex problems within a single company's application mix in a predictable and orchestrated fashion - were wickedly unsuited for the relatively simple, but massively-scaled big data jobs that companies want to run efficiently and cheaply.
"The reason why power is not an issue is that workloads have changed in the data center," explains Feldman. "Now companies have smaller workloads, and they are bursty in nature. And today's systems are particularly bad because they have all these features that suck power - out of order speculation, branch prediction, and so forth - that are not particularly useful for these kinds of new workloads and that consume lots of power. The end result is that we are taking the Space Shuttle to the grocery store."
So SeaMicro looked at all kinds of low-powered, relatively simple processors that it might base its data center appliances on, including VIA Technologies' Nano, low-voltage x64 parts from Intel and AMD, and even ARM processors commonly used in handhelds and cell phones. While SeaMicro thought the future "Bobcat" processors from AMD were interesting, they would not get to market in time, and among the Nano, ARM, and Atom alternatives, Feldman says that the single-core Atom offers the best bang for the buck and the added benefit - some might say absolute requirement - of compatibility with the x86 architecture. By SeaMicro's reckoning, on Internet-style workloads - search, Map/Reduce and Hadoop, social networking apps, and such - the Atom core offers about 3.2 times the performance per watt of a Xeon or Opteron core. And the box can run Windows or Linux applications unchanged.
Getting the right CPU for the job was only one third of the battle, however, because in a modern server, processors only account for about a third of the total power consumption. Chipsets, memory, networking (including on-server network ports and the external switch), and peripheral I/O account for the other two-thirds of the juice that gets sucked out of the wall. And so SeaMicro created what is in essence a supercomputer interconnection fabric that also virtualizes the memory and I/O for tiny Atom-based servers, many of which are crammed onto a single motherboard, with many of these mobos plugged into the fabric using plain old PCI-Express links.
That backplane virtualizes the networking and I/O for each Atom server and also includes an integrated switch, a load balancer, and a terminal server for all the servers in the box. This really is a single box compute cluster, and it also has room for integrated disks.
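To see why the CPU was only a third of the battle, here is a back-of-the-envelope sketch in Python. It takes the article's two figures (processors at roughly one third of server power, and SeaMicro's claimed 3.2 times performance per watt for the Atom core) and assumes, purely for illustration, that the same amount of work gets done and that nothing but the CPU changes. The function name is made up for this example and is not anything SeaMicro has published.

```python
def total_power_fraction(cpu_share: float, cpu_efficiency_gain: float) -> float:
    """Fraction of the original power still drawn if ONLY the CPU improves.

    cpu_share: portion of total server power used by processors (0..1).
    cpu_efficiency_gain: performance-per-watt multiplier of the new CPU,
        assuming the same amount of work is done (an illustrative assumption).
    """
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0 and 1")
    if cpu_efficiency_gain <= 0:
        raise ValueError("cpu_efficiency_gain must be positive")
    # The CPU part shrinks by the efficiency gain; everything else is untouched.
    return cpu_share / cpu_efficiency_gain + (1.0 - cpu_share)


# Figures from the article: CPUs ~1/3 of power, Atom ~3.2x perf per watt.
remaining = total_power_fraction(cpu_share=1 / 3, cpu_efficiency_gain=3.2)
print(f"Power still drawn: {remaining:.1%}")
print(f"Power saved:       {1 - remaining:.1%}")
```

Under those assumptions, swapping the processor alone saves a bit under a quarter of the power, which is a long way from the 75 percent SeaMicro promised. The rest has to come out of the chipsets, networking, and I/O, which is what the fabric is for.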
The secret sauce in the SeaMicro design is an ASIC chip that virtualizes disk access and Ethernet networking for each of the Atom servers. The ASIC also implements a 3D torus interconnect between all of the server nodes, which is similar to the interconnect that IBM developed for its BlueGene massively parallel Linux supercomputer and which delivers 1.28 Tb/sec of aggregate bandwidth across the 64 server motherboards and 512 cores inside the SM10000 chassis.
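For those who have not met a 3D torus before, the sketch below shows the general idea in Python: every node sits at a coordinate in a three-dimensional grid, and the links wrap around at the edges, so every node has exactly six neighbors. SeaMicro has not said how its torus is laid out; the 8 x 8 x 8 arrangement here is an assumption chosen only because it multiplies out to 512, and the function names are invented for this example.

```python
from itertools import product

# ASSUMPTION: 8 x 8 x 8 = 512 nodes. The article gives the node count and
# says "3D torus", but not the actual dimensions SeaMicro uses.
DIMS = (8, 8, 8)


def neighbors(node, dims=DIMS):
    """Return the six directly linked nodes of `node` in a 3D torus."""
    result = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coords = list(node)
            # Modulo arithmetic makes the edges wrap around.
            coords[axis] = (coords[axis] + step) % size
            result.append(tuple(coords))
    return result


def hop_distance(a, b, dims=DIMS):
    """Fewest hops between two nodes, going the short way round each ring."""
    return sum(
        min(abs(x - y), size - abs(x - y)) for x, y, size in zip(a, b, dims)
    )


nodes = list(product(*(range(n) for n in DIMS)))
origin = (0, 0, 0)
# A torus looks the same from every node, so the worst case from the
# origin is the worst case overall.
diameter = max(hop_distance(origin, n) for n in nodes)

print("nodes:", len(nodes))
print("neighbors of origin:", neighbors(origin))
print("worst-case hops:", diameter)
```

With this assumed layout, no server is more than 12 hops from any other, and there is no central switch for all the traffic to pile up in.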
SeaMicro also came up with its own field programmable gate array (FPGA) to do load balancing across the machines in a very efficient manner. The load balancing electronics are hooked into the SM10000's system management tools to allow for pools of servers to be grouped together and managed as a single object and to provide guaranteed performance levels for groups of processors, disk, memory, and fabric - something that Feldman says virtualized x64 servers cannot do because they often oversubscribe resources to drive up utilization. This capability is called Dynamic Compute Allocation Technology, or DCAT.
The combination of the ASIC and the FPGA removes 90 percent of the components in a normal server stack, according to SeaMicro. As a result, a 10U chassis with 64 SeaMicro server boards can replace 40 1U, two-socket x64 servers, two Gigabit Ethernet switches, two terminal servers, and a load balancer - what you would cram into a standard 42U rack these days running hyperscale, Webby workloads. And the SM10000 will draw one quarter of the power and therefore require one quarter of the cooling.
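The headline numbers are easy to sanity-check with a little arithmetic. The Python sketch below uses only figures quoted in this article; the per-board and per-server bandwidth values are simple averages of the aggregate number, not specifications from SeaMicro, and the variable names are made up for illustration.

```python
# Figures quoted in the article.
SERVERS = 512              # single-core Atom servers per chassis
BOARDS = 64                # server motherboards per chassis
CHASSIS_RACK_UNITS = 10    # the SM10000 is a 10U box
FABRIC_GBPS = 1280         # 1.28 Tb/sec aggregate fabric bandwidth
POWER_FRACTION = 0.25      # "one quarter of the power"

servers_per_board = SERVERS // BOARDS
servers_per_rack_unit = SERVERS / CHASSIS_RACK_UNITS

# Plain averages of the aggregate figure (an illustrative simplification;
# the article does not say how bandwidth is actually divided up).
fabric_gbps_per_board = FABRIC_GBPS / BOARDS
fabric_gbps_per_server = FABRIC_GBPS / SERVERS

power_saving = 1 - POWER_FRACTION

print(f"Servers per board:      {servers_per_board}")
print(f"Servers per rack unit:  {servers_per_rack_unit}")
print(f"Fabric Gb/s per board:  {fabric_gbps_per_board}")
print(f"Fabric Gb/s per server: {fabric_gbps_per_server}")
print(f"Power saving:           {power_saving:.0%}")
```

The last line is the reassuring one: drawing one quarter of the power is the same thing as the 75 percent cut SeaMicro pitched to the DOE.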