Niche supercomputer supplier Appro International has bagged the largest deal in its history, having won a massive contract from the US Department of Energy to supply up to 6 petaflops of raw computing power to three of its nuke labs for the next several years.
Appro was not allowed to divulge the price tag on the clusters that Tri-Labs will be installing, but John Lee, vice president of advanced technology solutions at the company, said the deal with the three nuke labs – Lawrence Livermore, Los Alamos, and Sandia National Laboratories, often called Tri-Labs – represented the largest procurement it had ever done both in terms of money and aggregate number-crunching power.
Lee added that the bidding for the Tri-Labs procurement was intense, with around a dozen suppliers contending for the work, in his estimation. The bidding at DOE is super secret, so the exact roll of vendors and systems that Appro beat out to win the deal is not known.
"Everybody including your grandmother wanted to bid on this thing," Lee says with a laugh. Companies did presentations to the DOE in early 2010 for the Tri-Labs procurement and the request for proposals went out earlier this year.
The Tri-Labs procurement is based on a future generation of Appro's GreenBlade blade servers, two-socket machines using Intel's future "Sandy Bridge" Xeon E5 processors, due around the third quarter of this year, if the rumors are right. The blades will be configured with 32 DDR3 memory slots running at 1.6GHz, but Appro was not at liberty to discuss the feeds and speeds of the Xeon processors.
What the company could say is that it was creating pods of computing elements, what it calls "scalable units", that are pre-integrated with QLogic InfiniBand switches running at 40Gb/sec (Quad Data Rate, or QDR) at the Appro factory and then shipped off to the three labs to be assembled using a bi-directional fat tree configuration into a large cluster weighing in at just over 1 petaflops per lab, or a little over 3 petaflops across the three. Brocade Communications is being tapped for management and storage switches as well.
"When all of the options are exercised, we expect that to double to over 6 petaflops," says Lee.
The Tri-Labs deal will consume a big portion of Appro's production capacity, and the DOE facilities will be getting all of the initial production for the future GreenBlade servers using Intel's Xeon E5 processors, according to Lee. The new GreenBlade design will be able to accommodate Intel's future "Knights" family of X64-based coprocessors that will plug into PCI-Express slots, due sometime in the future on Intel's 22 nanometer Tri-Gate production processes. Tri-Labs is not putting these Many Integrated Core, or MIC, coprocessors into the new supercomputer clusters, but one of the labs – Lee would not say which one – has requested that a subset of the GreenBlade nodes be equipped with M2090 fanless GPU coprocessors from Nvidia.
The initial delivery of the supercomputers will begin in the third quarter of this year, and by the first quarter of 2012 the 3 petaflops of aggregate oomph will be up and running across the three labs. By the third quarter of 2012, Lee expects the Tri-Labs systems to be stretched to the full 6 petaflops. The first phase of the procurement, the DOE confirmed, is worth $39m, and the second phase brings the total for the 6 petaflops of machines up to $89m. That's under $15,000 per teraflops, about a third of what it cost two years ago for petaflops-class machines.
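For anyone who wants to check the sums, here is a minimal sketch of the per-teraflops arithmetic. It assumes the round figures quoted above (exactly 3 petaflops for $39m and exactly 6 petaflops for $89m) and a helper name, `dollars_per_teraflops`, made up for illustration. Since the labs are getting slightly more than those round numbers, the real figure would be a touch lower.

```python
def dollars_per_teraflops(cost_usd: float, petaflops: float) -> float:
    """Cost per teraflops, using 1 petaflops = 1,000 teraflops."""
    return cost_usd / (petaflops * 1_000)


# Assumed round figures taken from the article, not exact contract numbers.
phase_one = dollars_per_teraflops(39e6, 3)    # first phase on its own
full_build = dollars_per_teraflops(89e6, 6)   # both phases combined

print(f"Phase one:  ${phase_one:,.0f} per teraflops")
print(f"Full build: ${full_build:,.0f} per teraflops")
```

On those assumptions the first phase works out to $13,000 per teraflops and the full build to roughly $14,833, which is where the "under $15,000" figure comes from.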
The machines, which will be networked together across the three labs, will be running the Tri-Labs Operating System, a variant of Red Hat Enterprise Linux cooked up by the nuke labs. Windows HPC Server 2008 R2 is not going to be an option on them, which means that to make them run Crysis, you need to add the WINE runtime environment on top of the Tri-Labs Linux variant.
The Appro machines are designated by the DOE as "capacity computing" resources, which means they do the grunt work for all of the weapons design, physics, and chemistry research done at the labs by many researchers. As such, the capacity machines are designed to be as balanced as they can be for supporting multiple workloads and many users. This is distinct from the DOE's "capability computing" machines, which are usually custom-made boxes with special features aimed at testing a new architecture for a very precise workload or handful of workloads. ®