The final piece of Cray's "Baker" XE6 massively parallel supercomputers, on which the company's financial 2010 hinges, makes its debut today.
The unveiling comes at Cray's user group meeting in Edinburgh, Scotland, a week ahead of the International Supercomputing Conference in Hamburg, Germany.
Cray has not said much publicly about what is perhaps the key piece of technology in the Baker systems: the "Gemini" XE system interconnect. That's because Cray can't afford to make promises it cannot keep, and electronic components are notoriously hard to bring to market on time and with the performance that high-end customers, like the world's largest supercomputer labs, expect.
Three years ago, Cray got burned for many quarters by delays at Advanced Micro Devices in bringing its quad-core "Budapest" Opteron 2000 processors to market for the XT4 systems. Peter Ungaro, then Cray's new chief executive officer, had to cope with revenue declines in the wake of the delays, and he doesn't want to repeat that nightmare.
But in recent months, as Cray has reported its financial results, Ungaro has been sounding more and more optimistic about the Gemini interconnect. He has always couched everything he says in reminders that the ASIC that makes up the Gemini interconnect could still have bugs that won't be found until final testing, which could force tweaking of the chip design and another round of fabbing and testing. That would push out sales by one or two quarters, basically repeating the Budapest fiasco.
The fact that Cray is talking about Gemini, and has lined up more than $200m in deals so far that include Baker machines for at least part of the contract, means that Gemini is coming along. But you won't hear Cray sounding cocky about that until it is in the field and systems using the interconnect have passed muster at the US super labs and generated revenues and profits.
According to Barry Bolding, vice president of scalable systems at Cray, the Baker supers were originally designed as an integrated system with four main elements: a new style of cabinet with a phase-change liquid cooling exchanger to suck heat out of the racks; AMD's G34-socket Opterons, packed as four two-socket servers on a single blade; a new Linux environment that masked the proprietary system interconnect from Linux and the parallel applications that run atop it; and a new high-speed interconnect. The Baker systems had a kind of fluid delivery schedule, but were originally due around 2009.
The Cray XE6 blade: Two Gemini interconnects on the left (which is the back of the blade), with four two-socket server nodes and their related memory banks
After the Budapest fiasco (those are El Reg's words, not Bolding's), Cray decided to break the Baker system into bits and pieces and go modular, rolling out what it could when it could instead of doing a big bang system. So the funky Baker cabinets came out with the XT5 machines as the ECOphlex.
The Opteron G34-based blades were previewed last fall as the XT6 blades, sporting the SeaStar2+ interconnect instead of the Gemini interconnect; the two conveniently slot into the same physical space on the blades. As soon as AMD had the 12-core "Magny-Cours" Opteron 6100s ready in March, Cray started shipping XT6 blades rather than waiting another six months or so for the Bakers to be whole and complete.
The third generation of the Cray Linux Environment, a goosed version of Novell's SUSE Linux Enterprise Server 11 that was originally expected only on Baker boxes and their Gemini interconnect, made its debut last month sporting a neat new feature called Cluster Compatibility Mode. With the CCM feature, the SeaStar interconnect (and now also the Gemini interconnect) has been equipped with drivers that make Linux think it is talking to an Ethernet network instead of SeaStar or Gemini, which is a very different kind of animal.
Right now, the CCM feature of Cray Linux Environment 3.0 is supported only on the XT6 and its XT6m mini variant, and it emulates only Ethernet. It will be backported to XT5 and XT5m machines later this year and to XT4 machines in early 2011. Eventually, CCM will be able to emulate InfiniBand as well, if your HPC apps prefer that.
Which brings us all the way up to the Gemini interconnect that completes the Baker XE6 systems.