Cray lands $188m Blue Waters NCSA contract

Breaking 10 petaflops with Opteron-GPU tag team

SC11 If there was a reason that Cray CEO Peter Ungaro, who formerly ran IBM's high performance computing biz, was a little extra perky when the company announced its third quarter financials two weeks ago, it was not just because the SC11 supercomputing trade show was coming to Cray's hometown of Seattle this week. Or that Advanced Micro Devices was once again late with an Opteron launch and had hurt its numbers.

It was because Ungaro knew that the National Science Foundation had ordered the budget for the "Blue Waters" petaflops-scale super to be rejigged and sent out for rebid – and Cray was in the hunt to win what turns out to be a $188m deal.

Bill Kramer, deputy project director of the Blue Waters project at the National Center for Supercomputing Applications at the University of Illinois, tells El Reg that Blue Waters was not a specific system, but rather a complete set of infrastructure, including a data center, plus computation, networking, and storage and, most importantly given the software goals of the NCSA, code that scales to real-world petaflops performance.

"We wanted to change the computational elements, and we got approval after a peer review," says Kramer.

So out goes IBM's super-dense Power 775 cluster version of Blue Waters, which Big Blue pulled the plug on back in August because the Power7-based machine was more expensive to manufacture than the company thought when it won the competitive bidding for the project back in 2007.

In comes the largest system that Cray has built in its history – at least until the 10 to 20 petaflops "Titan" ceepie-geepie hybrid that Cray is building for Oak Ridge National Laboratory is installed and if it is fully extended.

Like Titan, the Blue Waters system that Cray is building for NCSA will be a mix of standard XE6 Opteron blade server nodes and XK6 mixed CPU-GPU nodes, all linked together in a single network using the "Gemini" XE interconnect created by Cray. The XE6 blades will be equipped with eight sockets of 16-core "Interlagos" Opteron 6200 processors and the XK6 blades will have four Opteron sockets and one Nvidia GPU coprocessor per blade.

As with Titan, the Blue Waters system that Cray has pitched to NCSA will be based on Nvidia's next-generation "Kepler" GPUs. When Nvidia unexpectedly outed its roadmap in September 2010, Kepler was expected by the end of this year, but the chips are clearly coming sometime in 2012 – not in time for Christmas shopping this year.

The Kepler GPUs will offer a significant performance bump over the current 512-core "Fermi" graphics engines that are used in Nvidia's GeForce and Quadro video cards and Tesla coprocessors. They are fabbed by Taiwan Semiconductor Manufacturing Co using its 28 nanometer processes, which should allow Nvidia to put a lot more cores into a GPU as well as boost the clock speeds a bit. Nvidia hasn't said much about the Kepler GPU design, but it looks like notebook makers are going to get first stab at them early next year, and performance will be more than double what the Fermi GPUs do today. The top-end Fermi GPUs, with all 512 cores running at 1.3GHz, can deliver 665 gigaflops of double-precision floating point performance.

The specifications of the rebooted Blue Waters system are still a bit in flux, but here's what NCSA is getting for that $188m. The machine will have at least 235 XE6 cabinets, more than 30 XK6 cabinets, and more than 30 storage and I/O server cabinets. The resulting machine will have more than 49,000 Opteron 6200 processors and more than 380,000 cores, with another 3,000 Nvidia GPU coprocessors packing a hell of a floating point wallop as well. (If the Keplers offer twice the double-precision floating point performance as the Fermis, then those 3,000 GPU coprocessors will account for about 4 petaflops of aggregate oomph.) The plan is to put 4GB of main CPU memory per core into the machine, for a total of more than 1.5PB.
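Those figures hang together on the back of an envelope. A quick sketch of the arithmetic, assuming Kepler delivers roughly twice Fermi's double-precision rate per GPU (Nvidia has not confirmed final numbers):

```python
# Back-of-envelope check of the Blue Waters figures quoted above.
# Assumption (not confirmed by Nvidia): Kepler delivers roughly twice
# Fermi's 665 gigaflops of double precision per GPU.

fermi_dp_gflops = 665                 # top-end Fermi, 512 cores at 1.3GHz
kepler_dp_gflops = 2 * fermi_dp_gflops

gpus = 3000
gpu_petaflops = gpus * kepler_dp_gflops / 1e6   # gigaflops -> petaflops
print(f"GPU contribution: ~{gpu_petaflops:.1f} petaflops")   # ~4.0

cores = 380_000
gb_per_core = 4
total_pb = cores * gb_per_core / 1e6            # GB -> PB (decimal)
print(f"Main memory: ~{total_pb:.2f}PB")        # ~1.52PB
```

Both results line up with the quoted "about 4 petaflops" from the GPUs and "more than 1.5PB" of main memory.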

The XE6 and XK6 nodes will all be linked to each other using the XE interconnect through a 3D torus, which will require over 9,000 wires to link it all together. (That's around 4,500 kilometers of wire if you strung it all out.) The Blue Waters machine is expected to have 11.5 petaflops of aggregate peak number-crunching performance.
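In a 3D torus like Gemini's, every node links to six neighbors and coordinates wrap around at the edges of the mesh – which is what keeps worst-case hop counts down without the cost of a full crossbar. A minimal sketch of that addressing scheme; the dimensions here are illustrative, not Blue Waters' actual geometry:

```python
# Minimal sketch of 3D torus addressing: each node has six neighbors
# (+/-1 in each dimension), with coordinates wrapping at the edges.
# The (8, 8, 8) mesh below is illustrative, not the real machine's shape.

def torus_neighbors(x, y, z, dims):
    dx, dy, dz = dims
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),
    ]

# A node on the corner wraps around to the far side of the mesh:
print(torus_neighbors(0, 0, 0, (8, 8, 8)))
# [(1, 0, 0), (7, 0, 0), (0, 1, 0), (0, 7, 0), (0, 0, 1), (0, 0, 7)]
```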

Kramer says that the plan is to run the 16-core Opteron 6200 processors with half the cores sleeping much of the time, giving each remaining core full access to the 256-bit floating point unit that is shared by each pair of integer cores in the Bulldozer modules. By doing this, NCSA will be able to run the cores in a Turbo Core mode that can add anywhere from 600MHz to 1.3GHz of extra clocks to the chip, depending on the Opteron 6200 processor that NCSA chooses for the machine.

"We studied this quite a bit, but for a lot of our applications, it makes sense to run it like an eight-core with dual 128-bit floating point performance," says Kramer.
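The arithmetic behind that tradeoff is straightforward: because the FPU is shared within each two-core Bulldozer module, parking one core per module gives up no peak floating point throughput, so the Turbo Core clock bump is pure gain for FP-bound code. A rough sketch, with an assumed base clock and a conservative 600MHz boost – the actual SKU and clocks are not settled:

```python
# Rough sketch of why NCSA's half-the-cores mode pays off for floating
# point work. Clock speeds here are illustrative assumptions, not the
# actual Opteron 6200 part NCSA will choose.

flops_per_module_per_cycle = 8   # one shared 256-bit FPU per two-core module
modules_per_chip = 8             # 16 cores = 8 Bulldozer modules

base_ghz = 2.3                   # assumed base clock
turbo_ghz = base_ghz + 0.6       # Turbo Core adds 600MHz to 1.3GHz

# The FPU is per-module, so parking one core in each module costs no
# peak FP throughput -- the higher clock is pure gain:
all_cores = modules_per_chip * flops_per_module_per_cycle * base_ghz
half_cores = modules_per_chip * flops_per_module_per_cycle * turbo_ghz
print(f"16-core mode: {all_cores:.1f} peak DP gigaflops per chip")
print(f" 8-core mode: {half_cores:.1f} peak DP gigaflops per chip")
```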

Cray is also building a storage subsystem for the Blue Waters machine, which will run the Lustre file system and deliver 25PB of usable disk capacity. It is not yet clear how that local storage will be linked into the system, but the 25PB of capacity will be accessible through more than 1TB/sec of aggregate bandwidth from the cluster. An additional 500PB of nearline storage is also being added to the machine, which will have 100GB/sec of bandwidth into the cluster. The external network coming into the Blue Waters machine will have 100Gb/sec of bandwidth at first, and will eventually scale up to 300Gb/sec.
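To put those storage figures in perspective, here is a quick sanity check of how long a full sequential sweep of each tier would take at the quoted aggregate bandwidth:

```python
# How long a full sweep of each storage tier would take at the quoted
# aggregate bandwidth (decimal units throughout).

def sweep_hours(capacity_pb, bandwidth_gb_per_sec):
    seconds = capacity_pb * 1e6 / bandwidth_gb_per_sec  # PB -> GB
    return seconds / 3600

print(f"Lustre scratch: {sweep_hours(25, 1000):.1f} hours")  # 25PB at 1TB/sec
print(f"Nearline:       {sweep_hours(500, 100):.0f} hours")  # 500PB at 100GB/sec
```

Reading the entire 25PB scratch file system flat-out takes roughly seven hours, while a full pass over the 500PB nearline tier is a matter of weeks – a reminder that nearline is for archiving, not scratch space.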

The Blue Waters machine will run the Cray Linux Environment, the company's tweak on SUSE Linux 11 that also has an Ethernet network compatibility mode that allows applications compiled for Linux machines clustered using Ethernet to run unmodified and in emulation mode on top of the XE interconnect.

The Blue Waters machine will be delivered by Cray in phases over the next six to nine months, with the initial delivery early next year and early program deployment by the middle of 2012. The plan is to have the full machine operational by the end of 2012, which is more or less the same timing that NCSA expected with the IBM Power7 Blue Waters behemoth. ®
