HPC

Supercomputers take efficiency up another notch

RISC cores and hybrids deliver most flops per watt


SC10 More than the hand-wringing over parallel computing, the mounting electricity bill is the limiting factor holding back the growth of petascale systems. The recurring joke at the SC10 supercomputing conference last week was that we cannot build exascale systems that require their own nuclear power plant to juice them up. And that means supercomputer system designers have to figure out how to do more flopping with fewer electrons.

The Green500 ranking of supercomputers is a twist on the Top 500 ranking, but instead of ranking the machines by their sustained performance on the Linpack Fortran matrix math benchmark, the machines are arranged by their energy efficiency (in terms of megaflops per watt) as they run that benchmark test. The November 2010 Green500 list is not just a re-sort of the Top 500 list. Some machines that are very energy efficient are nonetheless small in terms of their teraflops and thus do not make the Top 500 ranking. Similarly, some relatively large and famous supers are so awful in terms of how much juice they use that they don't make the Green500 list. But generally speaking, there is a fair amount of overlap.

The Top 500 list is compiled by Erich Strohmaier and Horst Simon of the Lawrence Berkeley National Laboratory, Jack Dongarra of the University of Tennessee, and Hans Meuer of the University of Mannheim. We told you all about the latest Top 500 last week. The Green500 list, created by Wu-chun Feng and Kirk Cameron of Virginia Tech, has only been published eight times (two or three times a year, generally) since November 2007; the Top 500 list comes out twice a year, but has been published for the past 18 years.

The greenest super on the November 2010 Green500 rankings is the prototype of the BlueGene/Q that Big Blue is building for Lawrence Livermore National Laboratory for installation in 2012. (See the drilldown into the BlueGene/Q from El Reg for details on this super's design.)

The half-rack BlueGene/Q super, which is currently running at IBM's Watson Research Center in New York, ranked 115th on the Top 500 list at 65.35 teraflops (with 8,192 Power cores running at 1.6 GHz) and burned 38.8 kilowatts of juice. That works out to 1,684 megaflops per watt, giving the BlueGene/Q a considerable lead in terms of energy efficiency. You can see why LLNL is planning on ramping this machine up to 20 petaflops of peak performance in 2012. If the efficiency holds, that 20 petaflops box with nearly 1.6 million cores will burn 7.4 megawatts of juice.
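The arithmetic behind those figures can be sketched in a few lines of Python. Two things here are assumptions rather than anything stated above: that each core retires eight flops per clock, and that the 20 petaflops target is a peak number rather than a sustained Linpack result. The function names are made up for illustration.

```python
# Figures quoted in the article for the BlueGene/Q prototype.
SUSTAINED_TERAFLOPS = 65.35   # Linpack result
POWER_KILOWATTS = 38.8
CORES = 8192
CLOCK_GHZ = 1.6

# Assumption (not from the article): each core retires 8 flops per clock.
FLOPS_PER_CLOCK = 8


def megaflops_per_watt(teraflops, kilowatts):
    """Green500 metric: sustained megaflops divided by watts drawn."""
    return (teraflops * 1e6) / (kilowatts * 1e3)


def scaled_power_megawatts(target_petaflops, base_teraflops, base_kilowatts):
    """Power in megawatts if power grows linearly with flops."""
    scale = (target_petaflops * 1e3) / base_teraflops
    return base_kilowatts * scale / 1e3


efficiency = megaflops_per_watt(SUSTAINED_TERAFLOPS, POWER_KILOWATTS)
peak_teraflops = CORES * CLOCK_GHZ * FLOPS_PER_CLOCK / 1e3

# Assumption: the 20 petaflops target is a peak figure.
power_if_peak = scaled_power_megawatts(20, peak_teraflops, POWER_KILOWATTS)
# For contrast: the same target read as sustained Linpack flops.
power_if_sustained = scaled_power_megawatts(
    20, SUSTAINED_TERAFLOPS, POWER_KILOWATTS
)

print(f"{efficiency:.0f} megaflops per watt")
print(f"{peak_teraflops:.1f} teraflops peak")
print(f"{power_if_peak:.1f} MW if 20 petaflops is peak")
print(f"{power_if_sustained:.1f} MW if 20 petaflops is sustained")
```

Under these assumptions the peak reading lands on roughly 7.4 megawatts, while reading the target as sustained flops would push the bill to roughly 11.9 megawatts.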

By way of comparison, the top-ranked Tianhe-1A super located at the National Supercomputing Center in Tianjin, China, has a sustained performance of 2.57 petaflops and burns just over 4 megawatts of juice with its hybrid Intel Xeon-Nvidia GPU architecture, for a 635.2 megaflops per watt ranking on the Green500 list. (That's the 11th most power-efficient super on the list.)

Here's another interesting comparison. The Jaguar Cray XT5 super at the US Department of Energy's Oak Ridge National Laboratory, which is sorely in need of an upgrade, burns just under 7 megawatts to deliver its 1.76 petaflops using 224,162 Opteron cores, which works out to only 253.1 megaflops per watt and only an 81st ranking on the Green500 list. Cray XT5 customers can at least double their power efficiency by moving to twelve-core Opteron 6100 processors, and maybe more if they move to XE6 frames with the faster "Gemini" XE interconnect. (It depends on the workload.) Even moving to sixteen-core Opterons in 2011 will only boost performance by around 33 per cent and probably flops per watt by around the same amount, which will only get a future XE6 machine to somewhere around 675 megaflops per watt. (That is assuming a constant clock speed for the Opteron processors and increasing core counts in the processor sockets.) You can see why Cray is trying to figure out how to interface GPU co-processors into its system design. Increasing x64 core counts is not going to do the energy efficiency trick.
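That upgrade math can be sketched as follows. The multipliers are the rough estimates given above (a doubling, then a further 33 per cent), and the helper name is hypothetical.

```python
# Figures quoted in the article.
JAGUAR_MFLOPS_PER_WATT = 253.1
BLUEGENE_Q_MFLOPS_PER_WATT = 1684.0  # prototype, from the Green500 list


def projected_efficiency(base, multipliers):
    """Apply successive efficiency multipliers to a megaflops-per-watt figure."""
    result = base
    for multiplier in multipliers:
        result *= multiplier
    return result


# Step 1: twelve-core Opteron 6100s roughly double flops per watt.
after_twelve_core = projected_efficiency(JAGUAR_MFLOPS_PER_WATT, [2.0])
# Step 2: sixteen-core Opterons add about 33 per cent on top of that.
after_sixteen_core = projected_efficiency(JAGUAR_MFLOPS_PER_WATT, [2.0, 1.33])

lead = BLUEGENE_Q_MFLOPS_PER_WATT / after_sixteen_core

print(f"twelve-core:  {after_twelve_core:.0f} megaflops per watt")
print(f"sixteen-core: {after_sixteen_core:.0f} megaflops per watt")
print(f"BlueGene/Q lead after both upgrades: {lead:.1f}x")
```

Even with both upgrades applied, the projected figure stays in the neighbourhood of 675 megaflops per watt, leaving the BlueGene/Q prototype roughly two and a half times ahead.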

The Tsubame 2 ceepie-geepie cluster at the Tokyo Institute of Technology in Japan was the fourth most powerful machine on the Top 500 list, at 1.19 petaflops on the Linpack test; that machine lashes three Nvidia M2050 GPU co-processors to every server node, and the GPUs are doing approximately 70 per cent of the Linpack calculations.

Feng and Cameron estimate that the Tsubame 2 cluster burns 1.24 megawatts, and when you do the math, it delivers 958.4 megaflops per watt. Even if you could get a ceepie-geepie cluster to offer the 80-ish per cent efficiency of a traditional CPU cluster using a fast interconnect, the BlueGene/Q would still beat out Tianhe-1A and Tsubame 2 in terms of flops per watt. And that BlueGene/Q machine has similarly awful efficiency, with about half its flops going up the chimney right now, so there is plenty of room for IBM to do some tuning to get even better efficiency with BlueGene/Q before and after it is delivered to LLNL a little more than a year from now.
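The what-if in that paragraph can be checked with a rough sketch. The 50 per cent starting efficiency for the hybrids is an illustrative assumption, not a figure from the Green500 list, and the sketch also assumes power draw stays constant while the Linpack result improves.

```python
# Megaflops per watt as quoted in the article.
GREEN500 = {
    "BlueGene/Q prototype": 1684.0,
    "Tsubame 2": 958.4,
    "Tianhe-1A": 635.2,
}

# Assumption (illustrative, not from the article): the hybrids sustain about
# half of their peak flops on Linpack today.
CURRENT_EFFICIENCY = 0.50
TARGET_EFFICIENCY = 0.80  # the "80-ish per cent" of a well-tuned CPU cluster


def retuned(mflops_per_watt, current, target):
    """Flops per watt if Linpack efficiency improved at constant power draw."""
    return mflops_per_watt * target / current


bluegene = GREEN500["BlueGene/Q prototype"]

# Recompute flops per watt for the two hybrid machines only.
what_if = {
    name: retuned(value, CURRENT_EFFICIENCY, TARGET_EFFICIENCY)
    for name, value in GREEN500.items()
    if name != "BlueGene/Q prototype"
}

for name, value in what_if.items():
    print(f"{name}: {value:.0f} megaflops per watt at 80 per cent efficiency")

still_ahead = all(value < bluegene for value in what_if.values())
print(f"BlueGene/Q still ahead: {still_ahead}")
```

Under that assumption both hybrids remain below the untuned BlueGene/Q prototype's 1,684 megaflops per watt, which is the point the comparison makes.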

