Well, recess is over for the "Roadrunner" massively parallel supercomputer running at Los Alamos National Laboratory. It is time to get on with nuclear simulations that circumvent the Nuclear Test Ban Treaty.
Today, Los Alamos announced that the 1.1 petaflops Roadrunner super - which is a clustered blade server that mixes Opteron-based blades running Linux with multiple Cell-based blades for number crunching - has finished its shakedown period running various workloads, and is now ready to report for duty to its Department of Energy overlord. It's "beginning its transition to classified computing to assure the safety, security, and reliability of the US nuclear deterrent."
The DOE supercomputer centers steal a lot of the HPC headlines, but the DOE never gets into specifics of what it is really up to. It is widely believed that the classified work that the major DOE labs are doing with these supers is simulating how the US nuclear arsenal decays over time and designing new nuclear weapons that will not be tested in the field, but only simulated inside of a supercomputer. Just like Boeing's 777, the first commercial jetliner designed completely using 3D graphics and not using full-scale mockups. (Personally, I don't know what makes me more nervous: a 777 or a nuke bomb that wasn't beta tested with physical parts first.)
DOE doesn't want to talk about any of that, of course, but it wanted to point out today that as part of Roadrunner's shakedown, the techies at Los Alamos allowed ten different petascale workloads to run around like a crazy bird on the cluster.
In its six-month shakedown period, which ended in September, Roadrunner hosted the largest model of an expanding and accelerating universe as a way to try to figure out where we put all the dark matter and dark energy. (Silly, it is in the black ops budget.) The super was also used to map genetic sequences to create an HIV family tree in an effort to come up with a vaccine for AIDS, and it was used to simulate the interactions of lasers and plasmas as part of an effort to come up with controlled nuclear fusion. (Rather than the barely uncontrolled kind that comes on the tip of a missile.)
Roadrunner was also used to simulate how single atoms moving around in nanowires can cause the wires to break or change their mechanical and electrical properties. The machine also ran a simulation called SPaSM, which modeled the interactions of multiple billions of atoms as shockwave stresses smash materials to bits, shrink them, swell them, or otherwise deform them.
The National Nuclear Security Administration has plunked big supers into Los Alamos, Sandia National Laboratories, and Lawrence Livermore National Laboratory so they can work on America's nuclear deterrent using simulations. All of these labs have access to the Roadrunner machine, as well as some pretty hefty iron of their own. And, by the way, there is never enough iron, even when there is a recession and Uncle Sam is writing rubber checks.
The Roadrunner machine is currently reckoned by the Linpack Fortran benchmark test to be the most powerful supercomputer in the world, but that could change at Supercomputing 09 next month, when the fall list comes out. There is a good chance that the "Jaguar" Cray XT5 parallel Opteron cluster will squeak by Roadrunner in terms of raw sustained performance on Linpack.
Roadrunner is based on a tweaked version of IBM's BladeCenter blade servers. Each computational node has two dual-core Opteron 2210 processors running at 1.8 GHz; these nodes link out over PCI-Express buses to two other blades based on IBM's PowerXCell 8i co-processor. The basic idea is to give each core in the node its own Cell processor to use as a math unit. Each Cell chip has a 64-bit Power core and eight vector math units. IBM was expected to get a kicker Cell chip out the door that packs two Power cores and 32 vector units onto a single chip on something called the QS2Z blade, delivering 1 teraflops of double-precision oomph for each blade with two of these new Cell chips.
That is five times the performance of the current Cell blades used in Roadrunner. Couple that with some twelve-core "Magny-Cours" Opterons, and IBM could probably get Los Alamos a machine that can scale to more than 5 or 10 petaflops of sustained performance, provided the switching is upgraded to quad data rate InfiniBand to link the whole thing together and there are enough PCI-Express ports on the blades to double up or quadruple the Cell blades. (You need 24 Cell chips for a two-socket blade using 12-core Magny-Cours chips, and at two Cell chips per Cell blade, that would be a dozen PCI-Express slots per Opteron blade. This seems a bit much to cram onto a blade.)
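For those who like to check the sums, here is a minimal Python sketch of the one-Cell-per-core arithmetic described above. The function name and parameters are made up for illustration, and it assumes two Cell chips per Cell blade and one PCI-Express link per Cell blade, which is how the figures in this story work out; it is not anything IBM or Los Alamos has published.

```python
import math


def cell_blades_needed(sockets, cores_per_socket, cells_per_blade=2):
    """Return (cell_chips, cell_blades) when every Opteron core gets its own Cell chip.

    Assumptions (hypothetical helper, figures taken from the article):
      - one Cell chip per Opteron core
      - two Cell chips per Cell blade
      - one PCI-Express link per attached Cell blade
    """
    cell_chips = sockets * cores_per_socket
    # Round up so a partly filled Cell blade still counts as a whole blade.
    cell_blades = math.ceil(cell_chips / cells_per_blade)
    return cell_chips, cell_blades


# Current Roadrunner node: two dual-core Opterons.
print(cell_blades_needed(sockets=2, cores_per_socket=2))

# Speculative node with two twelve-core Magny-Cours Opterons.
print(cell_blades_needed(sockets=2, cores_per_socket=12))
```

The first call reproduces today's node (four Cell chips on two Cell blades), and the second gives the 24 chips and dozen PCI-Express links that look like a tight squeeze for a single blade.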
The QS2Z was slated for delivery in the first half of 2010 according to some old IBM roadmaps, but Big Blue hasn't said anything about it for years.
For its part, rival DOE lab Oak Ridge National Laboratory - which doesn't do nuke bomb simulations but does other nuke stuff - said recently that it would be using Nvidia's forthcoming "Fermi" graphics processors as co-processors for a future parallel super that would scale to around 10 petaflops.
Oak Ridge currently uses the XT5 system from Cray, which is powered by Advanced Micro Devices' Opteron processors, and the lab has not said if it will be upgrading this machine with more recent Opterons and adding in Fermi co-processors or starting from scratch and building a brand new box. ®