
Exit the dragon: US govt blows $325m on China-beating 300PFLOPS monster computer

Nvidia, IBM to kit out labs with world's fastest super

Vid The US government has chosen IBM and Nvidia chips to build the world's fastest supercomputer – a 300 petaFLOPS beast that would trounce today's most powerful super: China's Tianhe-2.

The Department of Energy has commissioned two supercomputers, it was revealed on Friday. One is a system codenamed "Summit", which will be installed at the Oak Ridge National Laboratory in Tennessee by 2017. It is designed to peak at 150 to 300 petaFLOPS – at the top end, that's 300 quadrillion calculations per second, or about five times faster than the 55 PFLOPS Tianhe-2.
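For the curious, the comparison is simple division. The sketch below uses only the peak figures quoted above; the names are ours, chosen purely for illustration.

```python
# Peak figures quoted in the article, in petaFLOPS (theoretical peak).
SUMMIT_PEAK_LOW_PFLOPS = 150
SUMMIT_PEAK_HIGH_PFLOPS = 300
TIANHE2_PEAK_PFLOPS = 55

# One petaFLOPS is a quadrillion (10^15) floating-point operations per second.
PETA = 10**15


def speedup(new_pflops, old_pflops):
    """Return how many times faster the new machine's peak is than the old one's."""
    return new_pflops / old_pflops


summit_high_flops = SUMMIT_PEAK_HIGH_PFLOPS * PETA
low_ratio = speedup(SUMMIT_PEAK_LOW_PFLOPS, TIANHE2_PEAK_PFLOPS)
high_ratio = speedup(SUMMIT_PEAK_HIGH_PFLOPS, TIANHE2_PEAK_PFLOPS)

print(f"Summit top-end peak: {summit_high_flops:.1e} calculations per second")
print(f"Speed-up over Tianhe-2: {low_ratio:.1f}x to {high_ratio:.1f}x")
```

At the bottom of its range, Summit's peak works out at closer to 2.7 times Tianhe-2's; the "about five times" figure applies to the 300 petaFLOPS top end.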

The other system is codenamed "Sierra", which is designed to peak at more than 100 petaFLOPS. This will be based at the Lawrence Livermore National Laboratory in California.

Together, the systems will cost $325m to build. The DoE has set aside a further $100m to develop "extreme scale supercomputing", or an exaFLOP machine in other words.

The US's fastest publicly known supercomputer is the Cray-built 27 petaFLOPS Titan at Oak Ridge, which is number two in the world rankings. Number three is the 20 petaFLOPS (20,000 teraFLOPS) Sequoia at Lawrence Livermore. These speeds, by the way, are the theoretical peak performance of the supers; the world rankings themselves are ordered by LINPACK benchmark results, which come in lower.

We're told the new machines, Summit and Sierra, will be used to model and predict chemical reactions, particularly bio-fuels, at the molecular level. The machines are also good for studying the effects of drugs on people, changes to Earth's climate, and nuclear safety, among other uses, it's understood. Boffins at Oak Ridge have been using supercomputers to model fusion reactors, as this video explains:

Youtube video

Both the Summit and Sierra supers will use IBM's next-gen POWER9 processors and Nvidia's Volta GPUs – which are two generations ahead of what's available today, and stack memory over the cores to maximize throughput.

Kepler and Maxwell are Nvidia's latest-generation architectures, Pascal is due in 2016, and then it's Volta's time to shine. It's understood 90 per cent of Summit and Sierra's floating-point calculations will be crunched using the Volta GPUs.

The POWER9 and Volta processors are tightly coupled in each server node using Nvidia's NVLink interconnect, which apparently shifts between 80 and 200GBps, way faster than the 31.5GBps a 16-lane PCIe v4 bus can manage.
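To put the interconnect numbers side by side, here is a rough sketch using only the bandwidth figures quoted above. The 16GB payload is a hypothetical example of ours, not a figure from Nvidia or IBM, and the sums ignore latency and protocol overhead.

```python
# Bandwidth figures quoted in the article, in gigabytes per second.
NVLINK_LOW_GBPS = 80
NVLINK_HIGH_GBPS = 200
PCIE4_X16_GBPS = 31.5

# Hypothetical payload size in gigabytes (an assumption, for illustration only).
PAYLOAD_GB = 16


def bandwidth_ratio(link_gbps, baseline_gbps=PCIE4_X16_GBPS):
    """Return how many times more bandwidth a link has than the baseline."""
    return link_gbps / baseline_gbps


def transfer_seconds(gigabytes, link_gbps):
    """Idealised time to move a payload, ignoring latency and overhead."""
    return gigabytes / link_gbps


for name, gbps in [
    ("PCIe v4 x16", PCIE4_X16_GBPS),
    ("NVLink (low)", NVLINK_LOW_GBPS),
    ("NVLink (high)", NVLINK_HIGH_GBPS),
]:
    print(
        f"{name}: {bandwidth_ratio(gbps):.1f}x PCIe, "
        f"{transfer_seconds(PAYLOAD_GB, gbps):.2f}s for {PAYLOAD_GB}GB"
    )
```

On the quoted figures, NVLink offers roughly 2.5 to 6.3 times the bandwidth of the PCIe bus.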

That NVLink has helped shrink the next-gen supercomputers: according to Nvidia, Summit will have 3,400 POWER9-Volta server nodes – a fifth of the number of nodes in the AMD Opteron-Nvidia K20x Titan. And Summit is expected to consume only 10 per cent more power than the 8,200 kW Titan despite dominating it in processing performance, we're told.
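The power claim translates into a performance-per-watt figure. The sketch below assumes Summit draws exactly 10 per cent more than Titan's 8,200 kW and uses the theoretical peak speeds quoted in this article, so treat the output as a back-of-the-envelope estimate rather than a measured efficiency.

```python
# Figures quoted in the article.
TITAN_PEAK_PFLOPS = 27
TITAN_POWER_KW = 8200
SUMMIT_PEAK_LOW_PFLOPS = 150
SUMMIT_PEAK_HIGH_PFLOPS = 300

# Assumption: "only 10 per cent more power" taken as exactly 1.10x Titan's draw.
SUMMIT_POWER_KW = TITAN_POWER_KW * 1.10


def gflops_per_watt(peak_pflops, power_kw):
    """Convert peak petaFLOPS and kilowatts into gigaFLOPS per watt."""
    gflops = peak_pflops * 1_000_000  # 1 PFLOPS = 10^6 GFLOPS
    watts = power_kw * 1_000          # 1 kW = 10^3 W
    return gflops / watts


titan_eff = gflops_per_watt(TITAN_PEAK_PFLOPS, TITAN_POWER_KW)
summit_eff_low = gflops_per_watt(SUMMIT_PEAK_LOW_PFLOPS, SUMMIT_POWER_KW)
summit_eff_high = gflops_per_watt(SUMMIT_PEAK_HIGH_PFLOPS, SUMMIT_POWER_KW)

print(f"Titan:  {titan_eff:.1f} GFLOPS/W")
print(f"Summit: {summit_eff_low:.1f} to {summit_eff_high:.1f} GFLOPS/W")
print(f"Gain:   {summit_eff_low / titan_eff:.1f}x to {summit_eff_high / titan_eff:.1f}x")
```

Under those assumptions, Summit would deliver roughly five to ten times Titan's peak performance per watt.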

Each of the fat Summit nodes will, according to the blueprints, churn through about 44 trillion calculations a second. The exact number of CPU and GPU cores is not known as IBM and Nvidia have insisted on keeping the POWER9 and Volta specs under wraps. Networking tech from Mellanox will be used to lash together nodes.
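As a sanity check, the per-node figure can be multiplied out against the node count Nvidia gave. This sketch uses only numbers quoted in this article and assumes every node hits the same peak, which is a simplification.

```python
# Figures quoted in the article.
SUMMIT_NODES = 3400
NODE_PEAK_TFLOPS = 44
SUMMIT_PEAK_HIGH_PFLOPS = 300

TFLOPS_PER_PFLOPS = 1000


def system_peak_pflops(nodes, tflops_per_node):
    """Aggregate peak in petaFLOPS, assuming identical nodes."""
    return nodes * tflops_per_node / TFLOPS_PER_PFLOPS


def node_tflops_needed(target_pflops, nodes):
    """Per-node teraFLOPS required to reach a target system peak."""
    return target_pflops * TFLOPS_PER_PFLOPS / nodes


total_pflops = system_peak_pflops(SUMMIT_NODES, NODE_PEAK_TFLOPS)
needed_for_top_end = node_tflops_needed(SUMMIT_PEAK_HIGH_PFLOPS, SUMMIT_NODES)

print(f"{SUMMIT_NODES} nodes x {NODE_PEAK_TFLOPS} TFLOPS = {total_pflops:.1f} PFLOPS")
print(f"Per-node peak needed for 300 PFLOPS: {needed_for_top_end:.1f} TFLOPS")
```

The product lands at about 150 petaFLOPS, the low end of Summit's quoted range; reaching the 300 petaFLOPS top end with 3,400 nodes would take roughly 88 teraFLOPS per node.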

According to a fact sheet from the Oak Ridge lab:

Each [Summit] node will contain multiple IBM POWER9 CPUs and Nvidia Volta GPUs all connected together with NVIDIA’s high-speed NVLink. Each node will have over half a terabyte of coherent memory (high bandwidth memory plus DDR4) addressable by all CPUs and GPUs, plus 800GB of non-volatile RAM that can be used as a burst buffer or as extended memory. To provide a high rate of I/O throughput, the nodes will be connected in a non-blocking fat-tree using a dual-rail Mellanox EDR InfiniBand interconnect.

Summit and Sierra are expected to run Linux, and scientists will be able to use familiar high-level programming languages – C, C++ and Fortran – along with OpenMP and MPI to develop highly parallelized applications for the machines. The fact that the POWER9 and GPU cores are interfaced using NVLink is abstracted away from the programmer.

"The world's top supercomputers have a CPU-GPU architecture. It's a must," Sumit Gupta, Nvidia's general manager of accelerated computing, told The Register.

"If you went with pure compute, at 150 petaFLOPS your supercomputer would need half the power of Las Vegas to run it. It's just not feasible to operate a supercomputer that size without GPUs."

Power consumption is an interesting point given that Nvidia is chasing the battery-constrained mobile market with GPU cores suitable for phones and tablets.

"Energy efficiency techniques learned in mobile GPU design have permeated through Nvidia. We're hitting the same power walls in all our products. When we’re more energy efficient, we can run the core clocks higher and get better performance," said Gupta, before quipping:

"Supercomputers like Titan are just as energy constrained as a mobile phone but the battery is much bigger, if you like. For my phone, I'm thinking about the number of hours between charges. For a supercomputer, I'll be thinking about the dollar value of that power.

"Power efficiency is at the top of the mind for the scientists we've spoken to. It's not feasible to build a 20MW computer."

US Secretary of Energy Ernest Moniz added in a canned statement on Friday: "High-performance computing is an essential component of the science and technology portfolio required to maintain U.S. competitiveness and ensure our economic and national security." ®
