Top 500 supers: China rides GPUs to world domination
The People's Republic of Petaflops
The "Jaguar" XT5 system at the US Department of Energy's Oak Ridge National Laboratory was knocked out of the top spot by Tianhe-1A, which is what happens when a cat stands still in the GPU era of HPC. The Jaguar machine has 224,162 Opteron cores spinning at 2.6 GHz and delivers 1.76 petaflops of performance on the Linpack test. This Cray machine links Opteron blade servers using its SeaStar2+ interconnect, which has been superseded by the new "Gemini" XE interconnect in the XE6 supers that started rolling out this summer.
If Oak Ridge had moved to twelve-core Opteron 6100 processors and the XE6 interconnect, it could have doubled the performance of Jaguar and held onto the Top 500 heavyweight title. One other thing to note: The Jaguar machine is 75.5 per cent efficient on the Linpack benchmark, a lot better than the Tianhe-1A ceepie-geepie.
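For the curious, that 75.5 per cent figure can be roughly reproduced from the core count and clock speed quoted above. The Python sketch below is only a back-of-the-envelope check: the four double-precision flops per core per clock is our assumption for this generation of Opterons, not a number taken from the list, and the function names are made up for illustration.

```python
def peak_petaflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak = cores x clock rate x flops per clock cycle."""
    return cores * clock_ghz * 1e9 * flops_per_cycle / 1e15


def linpack_efficiency(sustained_pf, peak_pf):
    """Fraction of theoretical peak actually delivered on Linpack."""
    return sustained_pf / peak_pf


# Figures quoted in the article for Jaguar.
JAGUAR_CORES = 224_162
JAGUAR_CLOCK_GHZ = 2.6
JAGUAR_SUSTAINED_PF = 1.76

# Assumption: 4 double-precision flops per core per clock cycle.
ASSUMED_FLOPS_PER_CYCLE = 4

jaguar_peak = peak_petaflops(JAGUAR_CORES, JAGUAR_CLOCK_GHZ, ASSUMED_FLOPS_PER_CYCLE)
jaguar_eff = linpack_efficiency(JAGUAR_SUSTAINED_PF, jaguar_peak)

print(f"Jaguar peak: {jaguar_peak:.2f} petaflops")
print(f"Jaguar Linpack efficiency: {jaguar_eff:.1%}")
```

Under that assumption the peak works out to a little over 2.3 petaflops, which squares with the efficiency quoted for the machine.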
The "Nebulae" ceepie-geepie, built from six-core Intel Xeon 5650 processors and Nvidia M2050 GPUs, made its debut on the June 2010 Top 500 list and has now been knocked down from number 2 to number 3. The Nebulae machine, which is a blade server design from Chinese server maker Dawning, is installed at the National Supercomputing Center in Shenzhen. It is rated at 1.27 sustained petaflops at 43 per cent efficiency against peak theoretical performance.
Number four on the list is also a ceepie-geepie: the upgraded Tsubame 2 machine at the Tokyo Institute of Technology. (That's shortened to TiTech rather than TIT, which would be where you'd expect a machine called Milky Way to be located. But we digress.) The Tsubame 2 machine is built from Hewlett-Packard's SL390s G7 cookie sheet servers, which made their debut in early October. TiTech announced the Tsubame 2 deal back in May, and this machine includes over 1,400 of these HP servers, each with three M2050 GPUs from Nvidia.
The Tsubame 2 machine has 73,278 cores, is rated at 2.29 peak petaflops, and delivered 1.19 petaflops of sustained performance on the Linpack test. That's 52 per cent efficiency, about what the other ceepie-geepies are getting. By the way, the prior Tsubame 1 machine was based on x64 servers from Sun Microsystems, with floating point accelerators from Clearspeed in only some of the nodes. And one more thing: Tsubame 2 runs both Linux and Windows, and according to the Top 500 rankers, both operating systems offer nearly equivalent performance.
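The efficiency sums for the two ceepie-geepies above are simple enough to check by hand, but here is a small Python sketch using only the figures quoted in this story. The variable names are ours, and the Nebulae peak is merely what the stated 43 per cent implies, not a number read off the list.

```python
# Tsubame 2: peak and sustained petaflops as quoted in the article.
TSUBAME2_PEAK_PF = 2.29
TSUBAME2_SUSTAINED_PF = 1.19

# Efficiency is sustained Linpack performance divided by theoretical peak.
tsubame2_eff = TSUBAME2_SUSTAINED_PF / TSUBAME2_PEAK_PF

# Nebulae: sustained petaflops and efficiency as quoted in the article.
NEBULAE_SUSTAINED_PF = 1.27
NEBULAE_EFFICIENCY = 0.43

# Rearranging the same ratio gives the peak implied by those two numbers.
nebulae_implied_peak = NEBULAE_SUSTAINED_PF / NEBULAE_EFFICIENCY

print(f"Tsubame 2 efficiency: {tsubame2_eff:.0%}")
print(f"Nebulae implied peak: {nebulae_implied_peak:.2f} petaflops")
```

On those numbers Nebulae has the higher theoretical peak of the two but converts less of it into sustained Linpack performance.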