Supercomputer maker Cray has staked a lot of its financial 2010 on its future "Baker" massively parallel Opteron-based servers and their new "Gemini" interconnect. And today, 2010 got off to a good start as Cray announced that the US Department of Defense has forked over more than $45m in tax dollars to drop three new Baker machines into three DoD super centers.
The Baker machines were acquired as part of the HPC Modernization Program that the DoD established in 1992 to keep its supercomputers up to speed. Each year, Congress kicks in funds to make sure that the DoD's supers stay, well, super. The feeds and speeds of the three Baker systems are not being divulged, but Cray said last week that the Baker machines will be built using the same "Magny-Cours" Opteron 6100 blades used in the forthcoming XT6 and XT6m machines, which are based on the earlier SeaStar2+ interconnect.
Those Opteron 6100 blades will pack about twice the number-crunching oomph of the blades used in the current XT5 and XT5m. (This stands to reason, since the latter machines use six-core Opteron 2400 processors and the Magny-Cours Opteron 6100s will sport a dozen cores, made possible by cramming two six-core chips into a single package.) The real new thing with the Baker systems is the "Gemini" interconnect, which, given its name, presumably also doubles up bandwidth and connectivity over the current interconnect.
Cray has said very little about Gemini publicly. With twice the oomph per blade, though, Cray should be able to put a box in the field in the third quarter of this year, when Baker shipments start, that is theoretically capable of at least 3.5 petaflops of sustained performance.
Sources at Cray would not divulge the sizes of the three machines the Pentagon is picking up or whether these boxes were at the front of the Baker line, but given how much dough Uncle Sam kicks Cray's way, it is hard to believe the DoD would not be at the front of the line.
The Baker boxes will be installed at the Air Force Research Laboratory (located at the Wright-Patterson Air Force Base in Ohio), the Arctic Region Supercomputing Center (in Fairbanks, Alaska, at the University of Alaska), and the Army Engineer Research and Development Center (Vicksburg, Mississippi). The Baker machines will support basic and applied research, product development, and evaluation for the military, including the development of new fuel, armor, and weapons systems, and will also be used to simulate long-term global weather.
The other three of the six HPC centers run for the US military are the Army Research Laboratory (at Aberdeen, Maryland), the Navy DoD Supercomputing Resource Center (at the Stennis Space Center in Mississippi), and the Maui High Performance Computing Center in Hawaii. The six centers together have over 700 million processor-hours of compute capacity per year to run workloads at the moment. It is unclear how much incremental power the three new Baker systems will deliver. The DoD is also not saying if the Baker boxes will kick out other gear or augment what is in place.
The Air Force lab has a Hewlett-Packard cluster with 2,048 Opteron cores, called Falcon and rated at 11.5 teraflops, and a Silicon Graphics Altix 4700 shared memory system with 9,216 Itanium cores, called Hawk and rated at 59 teraflops. It is the only lab of the six that does not have any Cray iron already.
The Arctic Region Supercomputing Center already has a 31.8 teraflop Cray XT5 system with 3,456 Opteron cores (named Pingo), plus a Sun Microsystems Opteron cluster with 2,280 Opteron cores rated at 12 teraflops (named Midnight) and an IBM Cell-based research prototype (called Quasar) with a dozen of the two-socket QS22 Cell blades.
The Army's Engineer Research and Development Center is already a Cray shop too, so the award of Baker systems is not a huge surprise. ERDC's XT3 system, named Sapphire, has 8,192 Opteron cores and is rated at 42.6 teraflops. For its time, Sapphire was one of the largest MPP machines in the world. The XT4 system at the facility, named Jade, uses quad-core Opterons for a total of 8,584 cores and 72.3 teraflops.
Instead of going to the XT5, ERDC bought an SGI Altix ICE Xeon cluster with 15,360 cores with QDR InfiniBand lashing it all together. This machine, named Diamond, is rated at 172 teraflops. If that SGI coffee cup with the Altix UV logo that is presumably sitting on the chief information officer's desk at ERDC didn't get the lab a great deal on a Baker XT7 system (if that is what the Bakers will indeed be called), nothing will.
The Army Research Lab at Aberdeen Proving Ground in Maryland bought three Altix ICE Xeon clusters as part of the HPC Modernization Program's 2009 budget cycle, including Harold, a 10,752-core cluster rated at 109.3 teraflops; TOW (after the missile), a 6,656-core cluster weighing in at 74.5 teraflops; and Icecube, a baby development machine with only 96 cores and no oomph to speak of.
This Army lab also has a bunch of different Linux Networx clusters with several thousand cores across all of the machines (SGI ate the carcass of Linux Networx in February 2008). The ARL also has two Cray XT5 machines: one for production called MRAP (with 10,400 cores rated at 95.7 teraflops) and a baby machine for development. (MRAP is short for Mine Resistant Ambush Protected, and refers to Humvee and other vehicles used by the Army.)
The Navy's super center in Mississippi is also a Cray shop already, and it has an XT5 box named Einstein with 12,872 cores, plus two Power-based clusters from IBM: the Power6 cluster called DaVinci has 5,312 cores and the Power5+ cluster called Babbage has 1,792 cores. The Navy didn't provide individual ratings for those machines, but together these boxes have 243 teraflops of number-crunching power.
The Maui supercomputer center also has Cray iron, but it is an older six-node Cray XD1 with only 144 cores. The big box in Maui is called Mana, and it is a cluster of Dell PowerEdge servers with 9,216 cores rated at 103 teraflops.
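For anyone who wants to check the tally, here is a minimal Python sketch that sums the ratings quoted above. The dictionary labels and the function name are ours, not the DoD's, and the machines quoted without a rating (Quasar, Icecube, the Linux Networx clusters, the baby XT5, and the XD1) are simply left out, so treat the result as a floor rather than an official figure.

```python
# Ratings in teraflops, exactly as quoted in the article.
# Labels are illustrative; unrated machines are omitted.
RATED_TERAFLOPS = {
    "Air Force lab: Falcon": 11.5,
    "Air Force lab: Hawk": 59.0,
    "Arctic Region center: Pingo": 31.8,
    "Arctic Region center: Midnight": 12.0,
    "ERDC: Sapphire": 42.6,
    "ERDC: Jade": 72.3,
    "ERDC: Diamond": 172.0,
    "ARL: Harold": 109.3,
    "ARL: TOW": 74.5,
    "ARL: MRAP": 95.7,
    "Navy center: Einstein, DaVinci, and Babbage combined": 243.0,
    "Maui center: Mana": 103.0,
}


def aggregate_teraflops(ratings):
    """Sum the quoted ratings across all rated machines."""
    return sum(ratings.values())


total_teraflops = aggregate_teraflops(RATED_TERAFLOPS)
print(f"{total_teraflops:.1f} teraflops, "
      f"or about {total_teraflops / 1000:.2f} petaflops")
```

The Navy's three machines go in as a single entry because only their combined rating was provided.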
When you add it all up, the six labs managed by the Pentagon have just over 1 petaflops of aggregate peak performance to deploy. Based on the $19.9m price tag that Oak Ridge National Laboratory paid to upgrade from 1 petaflops to 1.75 petaflops on its XT5 system last year, or a little over $27,000 per teraflops, you'd expect the DoD to get somewhere around 1.65 petaflops for the $45m it is shelling out.
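The back-of-the-envelope math behind that estimate can be sketched in a few lines of Python. The variable names are ours, and the inputs are the rounded figures quoted above (1 petaflops and 1.75 petaflops), so the result lands near, but not exactly on, the per-teraflops price and petaflops estimate in the text; the gap is presumably down to rounding in the quoted petaflops numbers.

```python
# Rounded figures as quoted in the article; variable names are illustrative.
upgrade_cost_usd = 19.9e6          # Oak Ridge XT5 upgrade price tag
teraflops_before = 1000.0          # 1 petaflops
teraflops_after = 1750.0           # 1.75 petaflops
dod_award_usd = 45e6               # DoD award to Cray

# Price per teraflops of added capacity in the Oak Ridge upgrade.
added_teraflops = teraflops_after - teraflops_before
usd_per_teraflops = upgrade_cost_usd / added_teraflops

# Capacity the DoD award would buy at that same price.
expected_teraflops = dod_award_usd / usd_per_teraflops

print(f"${usd_per_teraflops:,.0f} per teraflops")
print(f"{expected_teraflops / 1000:.2f} petaflops for the DoD's money")
```

This is a straight linear extrapolation: it assumes that a new machine costs the same per teraflops as an upgrade to an existing one.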
And if you want to take into account Moore's Law and price competition, you'd expect something on the order of 2 petaflops. But the Baker machines are entirely new machines, not an upgrade, so maybe it will work out to something like 1 petaflops of extra oomph. ®