Cray inks $45m super pact with DoD

So, three Bakers enter the Pentagon...


Supercomputer maker Cray has staked a lot of its 2010 financial results on its future "Baker" massively parallel Opteron-based servers and their new "Gemini" interconnect. And today, 2010 got off to a good start as Cray announced that the US Department of Defense has forked over more than $45m in tax dollars to drop three new Baker machines into three DoD super centers.

The Baker machines were acquired as part of the HPC Modernization Program that the DoD established in 1992 to keep its supercomputers up to speed. Each year, Congress kicks in funds to make sure that the DoD's supers stay, well, super. The feeds and speeds of the three Baker systems are not being divulged, but Cray said last week that the Baker machines will be built using the same "Magny-Cours" Opteron 6100 blades used in the forthcoming XT6 and XT6m machines, which are based on the earlier SeaStar2+ interconnect.

Those Opteron 6100 blades will pack about twice the number-crunching oomph of the blades used in the current XT5 and XT5m. (This stands to reason, since the latter machines use six-core Opteron 2400 processors, while the Magny-Cours Opteron 6100s will sport a dozen cores apiece, made possible by cramming two six-core chips into a single package.) The real new thing with the Baker systems is the "Gemini" interconnect, which presumably also doubles up bandwidth and connectivity over the current SeaStar2+ interconnect, given its name.

Cray has said very little about Gemini publicly, but between the doubled-up compute blades and the beefier interconnect, the company should be able to put a box in the field in the third quarter of this year, when Baker shipments start, that is theoretically capable of at least 3.5 petaflops of sustained performance.

Sources at Cray would not divulge the sizes of the three machines the Pentagon is picking up or whether these boxes were at the front of the Baker line, but given how much dough Uncle Sam kicks Cray's way, it is hard to believe the DoD would not be at the front of the line.

The Baker boxes will be installed at the Air Force Research Laboratory (located at Wright-Patterson Air Force Base in Ohio), the Arctic Region Supercomputing Center (in Fairbanks, Alaska, at the University of Alaska), and the Army Engineer Research and Development Center (Vicksburg, Mississippi). The Baker machines will support basic and applied research as well as product development and evaluation for the military, including work on new fuels, armor, and weapons systems, and will also be used to simulate long-term global weather.

The six HPC centers run for the US military also include the Army Research Laboratory (at Aberdeen, Maryland), the Navy DoD Supercomputing Resource Center (at the Stennis Space Center in Mississippi), and the Maui High Performance Computing Center in Hawaii. Together, the six centers currently have over 700 million processor-hours of compute capacity per year to throw at workloads. It is unclear how much incremental power the three new Baker systems will deliver, and the DoD is not saying whether the Baker boxes will kick out other gear or augment what is already in place.

The Air Force lab has a Hewlett-Packard cluster with 2,048 Opteron cores, called Falcon and rated at 11.5 teraflops, and a Silicon Graphics Altix 4700 shared memory system with 9,216 Itanium cores, called Hawk and rated at 59 teraflops. It is the only lab of the six that does not have any Cray iron already.

The Arctic Region Supercomputing Center already has a 31.8 teraflop Cray XT5 system with 3,456 Opteron cores (named Pingo), plus a Sun Microsystems Opteron cluster with 2,280 Opteron cores rated at 12 teraflops (named Midnight) and an IBM Cell-based research prototype (called Quasar) with a dozen of the two-socket QS22 Cell blades.

The Army's Engineer Research and Development Center is already a Cray shop, too, so the award of a Baker system is not a huge surprise. ERDC's XT3 system, named Sapphire, has 8,192 Opteron cores and is rated at 42.6 teraflops; for its time, Sapphire was one of the largest MPP machines in the world. The XT4 system at the facility, named Jade, uses quad-core Opterons for a total of 8,584 cores and 72.3 teraflops.

Instead of going to the XT5, ERDC bought an SGI Altix ICE Xeon cluster with 15,360 cores, with QDR InfiniBand lashing it all together. This machine, named Diamond, is rated at 172 teraflops. If that SGI coffee cup with the Altix UV logo that is presumably sitting on the chief information officer's desk at ERDC didn't get the lab a great deal on a Baker XT7 system (if that is what the Bakers will indeed be called), nothing will.

The Army Research Lab at Aberdeen Proving Ground in Maryland bought three Altix ICE Xeon clusters as part of the HPC Modernization Program's 2009 budget cycle, including Harold, a 10,752-core cluster rated at 109.3 teraflops; TOW (after the missile), a 6,656-core cluster weighing in at 74.5 teraflops; and a baby development machine with only 96 cores named Icecube and no oomph to speak of.

This Army lab also has a bunch of different Linux Networx clusters with several thousand cores across all of the machines (SGI ate the carcass of Linux Networx in February 2008). The ARL also has two Cray XT5 machines: one for production called MRAP (with 10,400 cores, rated at 95.7 teraflops) and a baby machine for development. (MRAP is short for Mine Resistant Ambush Protected, and refers to a class of armored vehicles used by the Army.)

The Navy's super center in Mississippi is also a Cray shop already, and it has an XT5 box named Einstein with 12,872 cores, plus two Power-based clusters from IBM: the Power6 cluster called DaVinci has 5,312 cores and the Power5+ cluster called Babbage has 1,792 cores. The Navy didn't provide individual ratings for those machines, but together these boxes have 243 teraflops of number-crunching power.

The Maui supercomputer center also has Cray iron, but it is an older six-node Cray XD1 with only 144 cores. The big box in Maui is called Mana, and it is a cluster of Dell PowerEdge servers with 9,216 cores rated at 103 teraflops.

When you add it all up, the six labs managed by the Pentagon have just over 1 petaflops of aggregate peak performance to deploy. Based on the $19.9m price tag Oak Ridge National Laboratory paid to upgrade its XT5 system from 1 petaflops to 1.75 petaflops last year, or roughly $26,500 per teraflops, you'd expect the DoD to get somewhere around 1.7 petaflops for the $45m it is shelling out.
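For readers who like to check the arithmetic, here is a minimal back-of-the-envelope sketch in Python using only the ratings quoted above. The machine labels are just shorthand for the systems named in this article; boxes without a published rating (the Linux Networx clusters, Icecube, Quasar, and the ARL development XT5) are left out, so treat the total as a floor rather than a precise census.

    # Back-of-the-envelope check on the aggregate capacity and the $/teraflops
    # estimate, using only the figures quoted in this article.
    published_teraflops = {
        "Falcon (AFRL)": 11.5,
        "Hawk (AFRL)": 59.0,
        "Pingo (ARSC)": 31.8,
        "Midnight (ARSC)": 12.0,
        "Sapphire (ERDC)": 42.6,
        "Jade (ERDC)": 72.3,
        "Diamond (ERDC)": 172.0,
        "Harold (ARL)": 109.3,
        "TOW (ARL)": 74.5,
        "MRAP (ARL)": 95.7,
        "Einstein + DaVinci + Babbage (Navy)": 243.0,  # quoted only as a combined figure
        "Mana (Maui)": 103.0,
    }

    total_tflops = sum(published_teraflops.values())
    print(f"Listed capacity: {total_tflops:,.1f} teraflops")  # ~1,026.7, i.e. just over 1 petaflops

    # Oak Ridge's XT5 upgrade: $19.9m bought the jump from 1 to 1.75 petaflops,
    # i.e. 750 extra teraflops.
    dollars_per_tflops = 19_900_000 / 750
    expected_petaflops = 45_000_000 / dollars_per_tflops / 1_000
    print(f"~${dollars_per_tflops:,.0f} per teraflops -> ~{expected_petaflops:.2f} petaflops for $45m")

Running it gives a listed total of about 1,027 teraflops, which squares with the "just over 1 petaflops" figure, and an expected haul of roughly 1.7 petaflops for the DoD's $45m.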

And if you want to take into account Moore's Law and price competition, you'd expect something on the order of 2 petaflops. But the Bakers are entirely new machines, not an upgrade, so maybe it will work out to something like 1 petaflops of extra oomph. ®
