NOAA goes to Cray for climate super
A Baker sticks a wet finger in the air
The people of the Tennessee Valley who depend upon the eponymous hydroelectric plants for their electricity may soon start seeing brownouts when what looks like another petaflops-class Cray supercomputer is plunked down at the Oak Ridge National Laboratory.
The Oak Ridge boys already have the 1.76 petaflops XT5 "Jaguar" massively parallel Opteron super and also take care of the "Kraken" 1.03 petaflops XT5 super owned by the University of Tennessee. These XT5 machines are based on the six-core "Istanbul" Opteron 8400 processors from Advanced Micro Devices (which were launched last summer) and make use of Cray's SeaStar2+ interconnect to lash thousands of blade servers together in a parallel cluster.
Now, Oak Ridge is being asked to babysit a new XT6 super based on AMD's newest twelve-core "Magny-Cours" Opteron 6100 processors, to be used by the National Oceanic and Atmospheric Administration for weather modeling — not short-term weather prediction, but long-term climate modeling.
The new Cray machine, which does not have a catchy name but is rather boringly to be called Climate Modeling and Research System, is being paid for jointly by the US Department of Energy, which runs the biggest supercomputer labs in the country, and the Department of Commerce, which controls NOAA but which does not use supercomputers to model the economy. The Obama Administration stimulus package, enacted into law a little more than a year ago as the American Recovery and Reinvestment Act, is footing the $47m bill.
The CMRS system will be built in two phases. It will begin life as an XT6 super built from eight-socket Opteron 6100 blade servers, each with four SeaStar2+ interconnect ports on the blade. That's one interconnect chip for each pair of sockets.
This XT6/SeaStar2+ super will be installed sometime during the second half of 2010, which means Cray has a chance of booking at least some of that $47m in revenues this year. The company is very keen on doing so, with sales in a slump until its next-generation "Gemini" interconnect appears, and that is not due until early in the third quarter. As El Reg previously reported, Cray is only expecting to book around $30m in revenues in the second quarter and around $50m in the third quarter as companies await its next-generation "Baker" systems, which will marry the XT6 blades to the Gemini interconnect. Cray has been a bit mysterious about the Gemini interconnect, but presumably it pairs up multiple SeaStars and doubles up the bandwidth and node connectivity.
In the second phase of the CMRS contract with NOAA, the existing XT6 box will be upgraded to a Baker system and a second Baker box will be dropped in as well. This second phase is not expected to be completed until 2011.
Additional upgrades are planned for the CMRS machine in 2012, no doubt including a shift to whatever "Bulldozer" Opterons are current at the time. AMD is expected to ship its 16-core "Interlagos" Opterons, presumably to be called the Opteron 6200s, in 2011. In 2012, AMD will put out a new chipset and possibly a new socket (together comprising a new platform) as well as a new chip, presumably the Opteron 6300, which will in turn be iterated in 2013 into what we will call the Opteron 6400.
If core counts keep growing at the current rates, the 2012 chip that might end up in the CMRS system should have 24-core sockets and probably eight sockets per blade. If this turns out to be true, then within the same footprint, this future CMRS box should have twice as many cores as the initial XT6 blade box. And if Gemini is really two SeaStar2+ interconnects linked together, this would then be a balanced system.
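For those who like to check the sums, the balance argument can be sketched in a few lines of Python. This is only a back-of-the-envelope model under stated assumptions: the `Blade` class and its field names are invented for illustration, bandwidth is measured in relative units where one SeaStar2+ port counts as 1.0, and the future blade is assumed to keep four ports with each Gemini port worth two SeaStar2+ ports, which is speculation rather than anything Cray has confirmed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    """Hypothetical model of a Cray blade, for illustration only."""
    sockets: int
    cores_per_socket: int
    interconnect_ports: int
    port_bandwidth: float  # relative units; one SeaStar2+ port = 1.0

    @property
    def cores(self) -> int:
        # Total cores on the blade.
        return self.sockets * self.cores_per_socket

    @property
    def cores_per_bandwidth_unit(self) -> float:
        # Cores sharing each unit of interconnect bandwidth.
        return self.cores / (self.interconnect_ports * self.port_bandwidth)


# Initial XT6 blade: eight twelve-core sockets, four SeaStar2+ ports.
xt6 = Blade(sockets=8, cores_per_socket=12,
            interconnect_ports=4, port_bandwidth=1.0)

# Speculative 2012 blade: 24-core sockets, and (ASSUMPTION) four Gemini
# ports with twice the bandwidth of a SeaStar2+ port.
future = Blade(sockets=8, cores_per_socket=24,
               interconnect_ports=4, port_bandwidth=2.0)

print(xt6.cores, future.cores)
print(xt6.cores_per_bandwidth_unit, future.cores_per_bandwidth_unit)
```

Under those assumptions the core count per blade doubles while the number of cores per unit of interconnect bandwidth stays put, which is all that "balanced" means here.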
By the way, the NOAA deal is one of the two big ones that Cray said were in the middle of being negotiated for a combined value of $90m when it reported its first quarter financial results earlier in May. That leaves another $43m deal to be closed to keep Cray on track to hit its goal of between $305m and $325m in sales, and of turning a profit, in 2010. Cray has to book about $200m in revenues in Q4 to hit these targets. That is a tough but doable proposition, provided the Baker systems and their Gemini interconnect perform as expected, and customers accept the boxes and therefore cut Cray the checks.
Cray did not say how powerful the initial or final CMRS system would be in terms of petaflops, but based on recent XT6 deals, $45m or so gets you something on the order of 1 petaflops. ®
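The revenue arithmetic is simple enough to sketch in Python. Every input below is a rounded figure quoted in this story, in millions of US dollars; the constant and function names are made up for illustration. The article gives no first quarter revenue number, so the last calculation only shows what Q1 would have had to contribute for the other figures to add up, and should be read as a rough consistency check rather than a reported result.

```python
# All figures in millions of US dollars, rounded as quoted in the article.
COMBINED_DEALS = 90          # two big deals under negotiation
NOAA_DEAL = 47               # the CMRS contract
TARGET_RANGE = (305, 325)    # Cray's 2010 sales goal, low and high end
Q2_EXPECTED = 30             # "around $30m"
Q3_EXPECTED = 50             # "around $50m"
Q4_NEEDED = 200              # "about $200m"


def remaining_deal(combined: int, closed: int) -> int:
    """Value of the deal still to be closed."""
    return combined - closed


def implied_q1(target: int, q2: int, q3: int, q4: int) -> int:
    """Q1 revenue implied by a full-year target (not a reported figure)."""
    return target - (q2 + q3 + q4)


other_deal = remaining_deal(COMBINED_DEALS, NOAA_DEAL)
q1_range = tuple(
    implied_q1(t, Q2_EXPECTED, Q3_EXPECTED, Q4_NEEDED) for t in TARGET_RANGE
)

print(f"Deal still to close: ${other_deal}m")
print(f"Implied Q1 revenue: ${q1_range[0]}m to ${q1_range[1]}m")
```

Since the quarterly numbers are all prefixed with "around" or "about", the implied range is only as good as that rounding.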