The Los Alamos National Laboratory has inked a $45m contract with supercomputer maker Cray, which will supply the nuke researchers with one of the first of its Baker series of Opteron-based supers when they ship later this year. The new machine will also provide some company for the petaflopping RoadRunner Opteron-Cell massively parallel super, custom-built by rival IBM for Los Alamos.
The Baker machines will couple the latest Opteron processors from Advanced Micro Devices with Cray's new 3D torus interconnect, code-named Gemini. The name suggests that the kicker to the current SeaStar+ interconnect will double up the performance of the crossbar and therefore also double the number of nodes that can be clustered together - although Cray has thus far been pretty vague about the capabilities of the Gemini interconnect.
The HPC supplier has also said very little about what will make Baker machines distinct from its impending XT6 supers, which use the twelve-core Magny-Cours Opteron 6100 processors that were announced earlier this week. With twelve-core Opterons and twice the interconnect bandwidth, the Baker machines should be able to deliver around 3.5 petaflops of sustained performance, according to my back-of-envelope calculations.
The machine that is going into Los Alamos is to be nicknamed Cielo, presumably because the sky is the limit but also because the US National Nuclear Security Administration is frustrated by current performance ceilings as it manages the country's arsenal of 6,000 nuclear weapons.
The NNSA, a semi-autonomous agency within the Department of Energy, makes and manages nukes for Uncle Sam. The major super labs in the States - Los Alamos, Sandia National Laboratories, and Lawrence Livermore National Laboratory - are all affiliated with the NNSA effort, which also has the goal of designing new nuclear weapons and simulating their explosions completely within a supercomputer. (You can't do this with real nukes because of the Nuclear Test Ban Treaty, of course.)
In its announcement, the NNSA says it chose the Baker boxes after a "highly competitive procurement process," which no doubt pitted IBM and Cray systems, and perhaps others, against each other. NNSA did not specify who was involved with the bidding, but did say that the Cielo super would be installed in the third quarter of 2010 with additional upgrades to the system expected in 2011. NNSA added that the Cielo massively parallel super could have more than ten times the performance of the ASCI Purple super installed at Lawrence Livermore, which is currently rated at 92.8 teraflops of peak performance. So we are talking petaflops territory for Cielo.
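For anyone who wants to check the "petaflops territory" claim, here is a minimal sketch of the arithmetic, using only the two figures quoted above. The function and constant names are made up for illustration, and because NNSA said "more than ten times", the result should be read as a floor rather than a rating for Cielo.

```python
# Figures quoted in the article; names are illustrative, not official.
PURPLE_PEAK_TFLOPS = 92.8  # ASCI Purple peak rating, in teraflops
CIELO_MULTIPLIER = 10      # NNSA says "more than ten times", so this is a lower bound


def cielo_floor_tflops(purple_tflops=PURPLE_PEAK_TFLOPS, multiplier=CIELO_MULTIPLIER):
    """Return the minimum peak performance implied by the NNSA statement."""
    return purple_tflops * multiplier


if __name__ == "__main__":
    floor = cielo_floor_tflops()
    print(f"Cielo lower bound: {floor:.0f} teraflops ({floor / 1000:.3f} petaflops)")
```

Ten times Purple's peak works out to 928 teraflops, a shade under a petaflops, so anything comfortably "more than" ten times puts Cielo over the line.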
The Purple super is based on IBM's dual-core 1.9GHz Power5+ processors, has 12,208 of them and 48TB of main memory, and is lashed together using IBM's proprietary Federation interconnect for its AIX boxes. Lawrence Livermore has a BlueGene/P super rated at 501.4 teraflops and the original BlueGene/L super, currently rated at 596.4 teraflops. These machines have huge numbers of PowerPC cores and run Linux.
NNSA says that the total value of the Cielo contract will be under $54m, and it looks like Cray will get the lion's share of $45m, with some left over for other things such as storage systems. Panasas, an existing storage supplier to Los Alamos for the RoadRunner system, is supplying its ActiveStor scale-out network-attached storage (NAS), which uses clustered nodes with an overall multi-petabyte capacity to deliver files fast enough to keep the Cielo cores busy.
John Morrison, the lab's high performance computing division leader, said: "The file system provided by Panasas for Cielo will support users at all three National Nuclear Security Administration laboratories, including Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories in their use of the system."
It's a good win for Panasas, keeping it in the public eye as it fights for business with Isilon, BlueArc, the HP-acquired Ibrix, and others.
The $45m Baker system win at NNSA follows another $45m deal Cray inked in late February with the US Department of Defense to put three Baker/Gemini systems into three different supers used by the Air Force, the Army, and the Arctic Region Supercomputing Center, which does work for the military. El Reg estimated back then that this $45m would have bought about 1 petaflops of number-crunching power for the US military across those three systems, and the price and expected performance of the Cielo machine bear out those estimates. ®
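The comparison between the two deals can be sketched as a rough cost-per-teraflops sum. This is a back-of-envelope illustration only: the helper name is hypothetical, the 1 petaflops figure for the DoD deal is El Reg's estimate, and the Cielo figure is merely the floor implied by "more than ten times" ASCI Purple's 92.8 teraflops. Contract values may also cover more than raw compute.

```python
def dollars_per_tflops(contract_dollars, tflops):
    """Rough price per teraflops; ignores services, upgrades, and anything else in the contract."""
    return contract_dollars / tflops


# DoD deal: $45m for an estimated ~1 petaflops (1,000 teraflops) across three systems.
dod = dollars_per_tflops(45_000_000, 1000)

# Cielo: $45m for at least 10 x 92.8 = 928 teraflops, so this is a ceiling on the price.
cielo_ceiling = dollars_per_tflops(45_000_000, 10 * 92.8)

if __name__ == "__main__":
    print(f"DoD estimate:  ${dod:,.0f} per teraflops")
    print(f"Cielo ceiling: ${cielo_ceiling:,.0f} per teraflops")
```

On these assumptions the two deals land within a few thousand dollars per teraflops of each other, which is about as close as estimates of this kind can be expected to get.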