In the IT business, volume is everything. And when people talk about standardization, this is what they really mean. Liquid Computing - a company that entered the server space with such a big splash (and a very sophisticated design) - has learned this lesson the hard way. And with its second rev of products, the company has ditched its proprietary server interconnection scheme and adopted Ethernet.
Back in 2003, Liquid Computing was founded by two Canadian engineers from telecom equipment maker Nortel who also had experience building supercomputers for the U.S. government's Defense Advanced Research Projects Agency. At the end of 2005, the company launched its first alpha product, LiquidIQ, based on Advanced Micro Devices' Opteron processors and using a homegrown interconnection scheme called IQInterconnect.
IQInterconnect interfaced with the Opteron's HyperTransport to create a box that could be configured, on the fly, as a parallel supercomputer cluster running the MPI protocol. Or carved up into virtual SMP slices. Or turned into a mix of SMP nodes running MPI and clustered to each other. (That's the liquid part of the Liquid Computing name). The company also moved its headquarters to Los Altos, California, presumably to be closer to the money customers it was targeting.
The IQInterconnect fabric, which linked the Opteron server nodes together, was proprietary. This is not a problem if the technology becomes widely adopted or, better still, becomes a standard by virtue of its volume. The IQInterconnect was implemented in inter-chassis switch modules that delivered up to 16 GB/sec of bi-directional bandwidth between the compute modules at a latency of under 2 microseconds.
The architecture of the switch fabric allowed up to 12 chassis of blade-style servers - supporting up to 960 Opteron sockets (back when they were still single core) - to be lashed together. SMP scalability was the limit of the operating systems, not the hardware. (Or so said Liquid Computing at the time). The compute blades were themselves originally based on four-socket Opteron 800 boards.
With the LiquidIQ 2.0 products announced this week, Liquid Computing has done something that must have been very difficult to do: it has let go of its IQInterconnect and replaced it with Ethernet switches. (Considering the amount of time and money and brains that went into creating it, this must have been very difficult indeed).
According to Keith Miller, vice president of product management at the company, a proprietary architecture was suitable for the high performance computing customers that Liquid Computing was targeting. But if you want to run off-the-shelf Linux distros and Windows and sell to a broader market, you have to support some other networking scheme. Ethernet is the default commercial interconnection for servers the world over, even if it does not have many of the advantages of IQInterconnect, such as being able to couple SMP nodes tightly for 16 or 32 socket images.
But the idea of going after a broader market, even if the server is less liquid (more slushy?), is undoubtedly making Liquid Computing's venture backers - VenGrowth Capital Partners, ATA Ventures, and Newbury Ventures, who have kicked in $45m to date in two rounds of funding and who are expected to participate in a third round in 2009 - a whole lot happier. Anyway, Miller says, the customers that Liquid Computing was talking to in its pilot tests were far more interested in the platform's integrated management tools than the flexible SMP scalability.
"It is a testament to the architecture that you can switch out the communications system and not touch the architecture," says Miller. And like many others, Miller sees 10 Gigabit Ethernet moving ahead "at a breakneck pace" and InfiniBand being relegated to a protocol for HPC clusters, "even though some people are trying to make a go of it as a unified fabric." If InfiniBand can't get established in the data center, IQInterconnect, no matter how slick, had even less of a chance.
With the LiquidIQ 2.0 machinery, the basic chassis design remains the same. Each chassis has 10 compute modules (a big blade) in the front and 10 in the back, and a rack holds two of these chassis, for a total of 40 modules per rack. The compute modules can be based on two-socket Opteron 2200 (dual-core) or 2300 (quad-core) or four-socket Opteron 8200 (dual-core) or 8300 (quad-core) processors.
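For readers who want the rack-level totals, the density arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration built from the chassis figures quoted above; the function and constant names are made up for the example and are not anything Liquid Computing publishes.

```python
# Chassis figures as quoted in the article; names are illustrative only.
MODULES_FRONT = 10      # compute modules in the front of a chassis
MODULES_BACK = 10       # compute modules in the back of a chassis
CHASSIS_PER_RACK = 2    # chassis that fit in one rack


def rack_totals(sockets_per_module: int, cores_per_socket: int) -> dict:
    """Return module, socket, and core counts for one fully populated rack."""
    modules_per_chassis = MODULES_FRONT + MODULES_BACK
    modules_per_rack = modules_per_chassis * CHASSIS_PER_RACK
    sockets = modules_per_rack * sockets_per_module
    return {
        "modules": modules_per_rack,
        "sockets": sockets,
        "cores": sockets * cores_per_socket,
    }


# Smallest option listed: two-socket, dual-core Opteron 2200 modules.
smallest = rack_totals(sockets_per_module=2, cores_per_socket=2)
# Largest option listed: four-socket, quad-core Opteron 8300 modules.
largest = rack_totals(sockets_per_module=4, cores_per_socket=4)

print(smallest)
print(largest)
```

On those assumptions, a rack of two-socket dual-core modules tops out at 160 cores, while a rack of four-socket quad-core modules reaches 640.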
The compute modules support up to 64 GB of main memory (16 memory slots using 4 GB DIMMs) and have a single Gigabit Ethernet port and four 10 Gigabit Ethernet ports. The Ethernet bandwidth, at 84 Gbit/sec for module-to-module links, is a lot less than the IQInterconnect, which offered 16 GB/sec of bandwidth between modules.
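The two fabrics are quoted in different units (Gbit/sec for Ethernet, GB/sec for IQInterconnect), so a quick conversion helps put them on the same scale. The sketch below assumes decimal units and 8 bits per byte, ignores protocol overhead, and uses the 16 GB/sec figure quoted earlier for the IQInterconnect switch modules; the helper names are illustrative.

```python
BITS_PER_BYTE = 8  # assumption: plain 8-bit bytes, no encoding overhead


def gbytes_to_gbits(gbytes_per_sec: float) -> float:
    """Convert a rate in GB/sec to Gbit/sec."""
    return gbytes_per_sec * BITS_PER_BYTE


def gbits_to_gbytes(gbits_per_sec: float) -> float:
    """Convert a rate in Gbit/sec to GB/sec."""
    return gbits_per_sec / BITS_PER_BYTE


ETHERNET_GBITS = 84.0       # module-to-module Ethernet figure quoted above
IQINTERCONNECT_GBYTES = 16.0  # IQInterconnect figure quoted earlier in the article

ethernet_gbytes = gbits_to_gbytes(ETHERNET_GBITS)
iq_gbits = gbytes_to_gbits(IQINTERCONNECT_GBYTES)

print(f"Ethernet:       {ETHERNET_GBITS:.0f} Gbit/sec = {ethernet_gbytes:.1f} GB/sec")
print(f"IQInterconnect: {iq_gbits:.0f} Gbit/sec = {IQINTERCONNECT_GBYTES:.0f} GB/sec")
```

Under those assumptions, 84 Gbit/sec works out to 10.5 GB/sec, against 128 Gbit/sec for a 16 GB/sec fabric.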
Miller says that Intel chips - which the company had hoped to support years ago - would plug into the box next year. And while Miller would not admit this, that means the LiquidIQ architecture really needs a HyperTransport-like interconnect for the processors. Something like Intel's QuickPath Interconnect, which debuts with its "Nehalem" Xeons next year.
The good news is that the LiquidIQ 2.0 machines run more than the modified version of Red Hat Enterprise Linux 4 that ran on the original machines, which shipped in November 2006. The boxes support Windows Server 2003 and Windows Server 2008 as well as RHEL 4 and 5 and Oracle Enterprise Linux 5.1. On the hypervisor front, Microsoft's Hyper-V is supported, as is Oracle's eponymous VM clone of the open source Xen hypervisor and VMware's ESX Server 3.5. The company has worked with NetApp to tightly integrate its NAS disk arrays into the system, but also supports SAN and NAS arrays made by IBM, EMC, Dell, and a few lesser known players.
A base configuration of the LiquidIQ system with a couple of compute modules and a couple of Ethernet switches sells for around $50,000, according to Miller. A full chassis with 20 compute nodes and a reasonable number of switches costs around $300,000. The company has a handful of customers using the boxes in production now and has a lot of proofs of concept underway, which it hopes to flip to sales soon.
For now, Liquid Computing's main selling point is that these boxes are much easier to provision and manage using its out-of-band management tools. One customer using the machines is taking 2 to 3 days to provision a new server for customers instead of taking 45 days using manual processes and less sophisticated tools. ®