China takes HPC heavyweight title

GPUs, Arch interconnect knock out Jaguar and Roadrunner


If it wasn't immediately obvious that China is a superpower, today's announcement that the Tianhe-1A CPU-GPU hybrid is the most powerful supercomputer in the world - and by a comfortable margin - will make it abundantly clear.

China wants to move from being a manufacturing powerhouse to being a full player in the 21st century technological economy, and it is making the investments to transform itself.

The National Supercomputer Center in Tianjin, China, this morning rolled out the Top 100 rankings of the country's fastest supercomputers (based on the Linpack Fortran benchmark test, like the global Top 500 list). The Tianhe-1A (the name translates from Chinese as "River in the Sky" or "Milky Way", with a model number slapped on it) beat out all of its rivals. The supercomputer is based on a rack server design created by the National University of Defense Technology (NUDT), and comprises 14,336 Xeon processors and 7,168 of Nvidia's Tesla M2050 fanless GPU co-processors.

The resulting machine has a peak theoretical performance of 4.7 petaflops, which is a gargantuan amount of raw performance, but where the rubber hits the road on the Linpack test, the machine delivers 2.51 petaflops.

That means 47 per cent of the theoretical performance of the machine is going up the chimney. This is not particularly good. But with CPU-GPU clusters costing roughly a quarter as much as CPU-only clusters on a teraflops-for-teraflops basis, according to Sumit Gupta, product marketing manager for the Tesla product line, the inefficiency can be tolerated for the sake of scalability. For now, at least.
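For anyone who wants to check the chimney arithmetic, here is a minimal Python sketch. It uses only the two Tianhe-1A figures quoted above; the function and variable names are made up for illustration.

```python
def linpack_efficiency(sustained_pflops: float, peak_pflops: float) -> float:
    """Return the fraction of theoretical peak actually delivered on Linpack."""
    if peak_pflops <= 0:
        raise ValueError("peak performance must be positive")
    return sustained_pflops / peak_pflops


# Figures quoted in the article for Tianhe-1A, in petaflops.
PEAK_PFLOPS = 4.7
SUSTAINED_PFLOPS = 2.51

efficiency = linpack_efficiency(SUSTAINED_PFLOPS, PEAK_PFLOPS)
lost = 1.0 - efficiency  # the share "going up the chimney"

print(f"delivered: {efficiency:.1%}, lost: {lost:.1%}")
```

The lost share comes out a shade under 47 per cent, which is where the rounded figure comes from.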

Coders and hardware engineers the world over will now be trying to boost efficiencies on the PCI-Express bus, on the system interconnects, and in the software stack to get the sustained performance a lot closer to the peak for ceepie-geepie hybrid machines. Gupta says that the GPUs are responsible for around 70 per cent of the calculations that were done on the Linpack test.
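As a rough cross-check on how much of the machine's peak sits in the GPUs, the sketch below rebuilds the theoretical peak from the chip counts. The per-chip numbers are assumptions, not figures from the announcement: it supposes six-core 2.93GHz Xeons doing four double-precision flops per core per clock, and 515 gigaflops of double-precision peak per Tesla M2050. Under those assumptions the total lands close to the quoted 4.7 petaflops.

```python
# Chip counts quoted in the article.
NUM_XEONS = 14_336
NUM_GPUS = 7_168

# ASSUMPTIONS (not stated in the article): per-chip double-precision peaks.
XEON_GHZ = 2.93             # assumed clock speed of each Xeon
XEON_CORES = 6              # assumed cores per Xeon
XEON_FLOPS_PER_CYCLE = 4    # assumed DP flops per core per clock
GPU_PEAK_GFLOPS = 515.0     # assumed DP peak of one Tesla M2050

xeon_peak_gflops = XEON_GHZ * XEON_CORES * XEON_FLOPS_PER_CYCLE

# Convert gigaflops to petaflops (1 petaflops = 1e6 gigaflops).
cpu_pflops = NUM_XEONS * xeon_peak_gflops / 1e6
gpu_pflops = NUM_GPUS * GPU_PEAK_GFLOPS / 1e6
total_pflops = cpu_pflops + gpu_pflops
gpu_share = gpu_pflops / total_pflops

print(f"CPU peak: {cpu_pflops:.2f} PF, GPU peak: {gpu_pflops:.2f} PF")
print(f"total peak: {total_pflops:.2f} PF, GPU share of peak: {gpu_share:.0%}")
```

If those assumptions hold, the GPUs account for a somewhat larger slice of the peak than the roughly 70 per cent of Linpack calculations Gupta attributes to them, which is consistent with the GPU side being the harder one to keep fed.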

Like the USS Enterprise, the Tianhe-1A, as the name suggests, is not the first hybrid parallel super that China has put into the field. The Tianhe-1 cluster, based on Intel Xeon chips and Advanced Micro Devices' Radeon HD 4870 GPUs, broke onto the Top 500 list in November 2009. That machine had 71,680 cores, a peak theoretical performance of 1.2 petaflops, and a sustained performance of 563.1 teraflops. In that case, 53 per cent of the aggregate performance went up the chimney.
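To put the two generations side by side, here is a small Python sketch using only the peak and sustained figures quoted in this story. The `Machine` class is a hypothetical helper for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Peak and sustained Linpack performance, both in teraflops."""
    name: str
    peak_tflops: float
    sustained_tflops: float

    @property
    def efficiency(self) -> float:
        # Fraction of theoretical peak delivered on Linpack.
        return self.sustained_tflops / self.peak_tflops


# Figures as quoted in the article, converted to teraflops.
tianhe_1 = Machine("Tianhe-1", peak_tflops=1_200.0, sustained_tflops=563.1)
tianhe_1a = Machine("Tianhe-1A", peak_tflops=4_700.0, sustained_tflops=2_510.0)

for m in (tianhe_1, tianhe_1a):
    print(f"{m.name}: {m.efficiency:.1%} delivered, {1 - m.efficiency:.1%} lost")

speedup = tianhe_1a.sustained_tflops / tianhe_1.sustained_tflops
print(f"sustained speedup: {speedup:.2f}x")
```

On these numbers the newer machine wastes a smaller share of its peak than its predecessor did, and delivers well over four times the sustained Linpack performance.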

The Tianhe-1A CPU-GPU hybrid super

The Tianhe-1A super is not important just because it is now the fastest supercomputer in the world, but because NUDT has spent years developing its own proprietary interconnect for the server nodes. And as El Reg previously reported, a future generation of Tianhe machines will use a homegrown multi-core processor, called Godson and based on the MIPS core. (So when does China's Institute of Computing Technology, part of the Chinese Academy of Sciences, start making its own GPUs?)

According to sources at Nvidia, which had people on the floor at the unveiling of Tianhe-1A in China this morning, the proprietary interconnect is called Arch and it links the server nodes together using optical-electric cables in a hybrid fat tree configuration. The switch at the heart of Arch has a bi-directional bandwidth of 160 Gb/sec, a latency for a node hop of 1.57 microseconds, and an aggregate bandwidth of more than 61 Tb/sec.

Some people have been suggesting that this interconnect somehow links the GPUs to the CPUs, but I am fairly certain that the GPUs hook to the CPUs by the plain old PCI-Express 2.0 bus in the server nodes. It would be very interesting if this interconnect has something akin to Remote Direct Memory Access, which allows a node to reach into and directly talk over the PCI-Express bus to the memory in a GPU in another node. Nvidia didn't mention this, and no one else has either, but that could significantly speed up performance if the Arch switch has such a feature.

The Tianhe-1A super has an aggregate of 262 TB of main memory and 2 PB of storage implemented as a Lustre clustered file system. The machine comprises 112 compute racks, eight storage node cabinets, six communications racks, and 14 I/O racks.

I personally welcome our Chinese HPC overlords. It's hard not to when my government owes their government $2 trillion, right? ®

Biting the hand that feeds IT © 1998–2021