China looks set to pip Uncle Sam at the post in exascale computer race
Tianhe-3 rig a year in front of stateside supers
Analysis: China is to take a clear lead in the exascale superdupercomputer race - its Tianhe-3 system looks to be a whole year ahead of the US's best efforts.
Aurora, Frontier and El Capitan, the three US exascale supercomputers, are set to be eclipsed by China's 1 exaFLOPS Tianhe-3 system, which is being prototyped this year with delivery expected in 2020. The first US exaflopper, the Cray/Intel Aurora A21, will debut in 2021.
It takes much time, a lot of technology effort and big bucks to build a computer capable of an exaFLOPS – a billion billion floating point operations a second.
Scaling current supercomputer architecture runs into problems. The cancelled Cray/Intel Aurora 1 was going to have 50,000 nodes running Intel's Knights Hill gen 3 Phi processors with a 200Gbit/s Omni-Path interconnect. It would have had a 3.6TF node performance and a 180PF system performance, and would have consumed 13MW.
Scaling that up to 1EF, or 1,000 petaFLOPS, would involve more than 250,000 nodes and unbalanced system interconnect performance. It would consume more than 65MW. This is an unrealistic approach.
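For readers who want to check the sums, here is a minimal sketch of that scale-up. It assumes performance and power both grow strictly in proportion to node count, which is the naive case being criticised rather than how a real system behaves; the function and constant names are ours.

```python
import math

# Figures quoted above for the cancelled Aurora 1 design.
NODE_TFLOPS = 3.6        # per-node performance, teraFLOPS
NODES = 50_000
SYSTEM_MW = 13.0         # system power draw, megawatts
TARGET_PFLOPS = 1_000.0  # 1 exaFLOPS expressed in petaFLOPS


def scale_linearly(target_pflops):
    """Return (nodes, megawatts) for a naive, strictly linear scale-up."""
    system_pflops = NODE_TFLOPS * NODES / 1_000  # about 180 PF
    factor = target_pflops / system_pflops
    return math.ceil(NODES * factor), SYSTEM_MW * factor


nodes, megawatts = scale_linearly(TARGET_PFLOPS)
print(f"{nodes:,} nodes, about {megawatts:.0f} MW")
```

Under those assumptions the answer lands a little above the round numbers quoted, at roughly 278,000 nodes and 72MW, which only strengthens the point.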
The second-generation Aurora, the A21 system, will use a new x86-ancillary processor architecture, not involving the now-cancelled Knights Hill Phi processor. Intel says the architecture exists in a prototype form, sort-of, and is not completely novel. The Omni-Path interconnect may now be 400Gbit/s, and the power envelope is 20-40MW.
Aurora A21 is scheduled for a 2021 delivery. Its companion Frontier system has an unknown architecture, no identified hardware suppliers and a 2021 delivery date. The El Capitan system is in the same fluid state.
China's Tianhe-3 should blink into life in 2020, two years away. Can they hit that schedule?
What we know about Tianhe-3 is its name, its overall schedule, some funding data, its precursor systems, and not a lot besides.
Precursors
There are three precursor systems, Tianhe-1, Tianhe-2 and Sunway TaihuLight:
| | Tianhe-1 | Tianhe-2 | Sunway TaihuLight |
|---|---|---|---|
| Performance | 2.5 PF | 34 PF | 93 PF |
| Processor Notes | 4,096 Xeon + AMD GPUs | 32,000 Xeon Ivy Bridge & Phi | 40,960 SW26010 |
| Cores | 186,368 | 3,120,000 | 10,649,600 |
| Nodes | ? | 16,000 | 40,960 |
| OS | Linux | Kylin Linux | Sunway RaiseOS 2.0.5 (Linux base) |
| Operational Date | Late 2010 | 2013 | 2016 |
| Host Site | National Computing Centre, Tianjin | National Computing Centre, Guangzhou | National Computing Centre, Wuxi |
| Interconnect | NUDT Arch 160Gbit/s | NUDT TH Express-2, fat tree topology with 13 576-port switches, opto-electronic | 5-level integrated hierarchy, custom-designed interconnect |
Tianhe-2 had more than ten times the performance of Tianhe-1. The next step up was nearly threefold, to Sunway TaihuLight, which uses Chinese-designed processors.
Tianhe-2 was the world's fastest supercomputer until surpassed by Sunway TaihuLight in 2016.
Tianhe-2 was a 17.6MW system, 24MW with cooling. Sunway TaihuLight uses 15MW.
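Those power figures matter more when set against performance. A quick sketch, using only the performance and power numbers quoted above (and Tianhe-2's 17.6MW figure without cooling), shows the jump in efficiency; the dictionary layout and function name are illustrative.

```python
# Performance (PF) and power draw (MW) as quoted in the article.
systems = {
    "Tianhe-2": (34.0, 17.6),
    "Sunway TaihuLight": (93.0, 15.0),
}


def pf_per_mw(petaflops, megawatts):
    """Energy efficiency in petaFLOPS per megawatt."""
    return petaflops / megawatts


efficiency = {name: pf_per_mw(pf, mw) for name, (pf, mw) in systems.items()}
for name, value in efficiency.items():
    print(f"{name}: {value:.1f} PF/MW")

# How much more work per megawatt the newer machine delivers.
improvement = efficiency["Sunway TaihuLight"] / efficiency["Tianhe-2"]
print(f"Improvement: {improvement:.1f}x")
```

On these numbers Tianhe-2 manages about 1.9 PF per megawatt and TaihuLight about 6.2, a little over three times better.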
What's apparent is China's development of its own technology, particularly the interconnects and then the processors. Sunway TaihuLight has 40,960 Chinese-designed SW26010 manycore 64-bit RISC processors.
So we can expect Tianhe-3 to continue this trend.
Tianhe-3
The system will be hosted by the National Computer Centre at Tianjin. A prototype will become operational this year and deliver 4 to 5 PF, suggesting it could be a single node or a few. The completed system's power draw will be 30-40MW and the application areas include big data, AI, healthcare, and smart cities.
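To put the 30-40MW envelope in context, here is a rough sketch of the efficiency an exascale machine would need to fit inside it. It assumes the completed system delivers exactly 1,000 PF, in line with the 1 exaFLOPS target quoted earlier; the function name is our own.

```python
TARGET_PF = 1_000.0  # 1 exaFLOPS expressed in petaFLOPS


def required_efficiency(power_mw, target_pf=TARGET_PF):
    """PetaFLOPS per megawatt needed to reach target_pf within power_mw."""
    return target_pf / power_mw


low_budget_mw, high_budget_mw = 30.0, 40.0
best_case = required_efficiency(high_budget_mw)   # most generous power budget
worst_case = required_efficiency(low_budget_mw)   # tightest power budget
print(f"Needs {best_case:.0f} to {worst_case:.0f} PF/MW")
```

That works out at 25 to 33 PF per megawatt, several times the roughly 6 PF per megawatt implied by the TaihuLight's quoted 93 PF and 15MW.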
We don't know its processor architecture, its interconnect technology, its core count, its node count or its OS.
El Reg can surmise it will run a development of the existing Linux-based Sunway RaiseOS and have another custom-designed interconnect system. A straight scaling of the TaihuLight's CPUs from 93 PF to 1 EF would involve around 440,000 of the Chinese-designed SW26010 manycore 64-bit RISC processors.
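A back-of-the-envelope check of the straight-scaling estimate, using the TaihuLight's 93 PF, 40,960 processors and 10,649,600 cores from the table. It assumes performance scales perfectly with processor count, which no real machine achieves; the names are illustrative.

```python
import math

# TaihuLight figures from the precursor table.
TAIHULIGHT_PF = 93.0
TAIHULIGHT_CPUS = 40_960
TAIHULIGHT_CORES = 10_649_600

CORES_PER_CPU = TAIHULIGHT_CORES // TAIHULIGHT_CPUS  # 260 cores per SW26010


def cpus_for(target_pf):
    """SW26010 count for target_pf, assuming perfectly linear scaling."""
    return math.ceil(TAIHULIGHT_CPUS * target_pf / TAIHULIGHT_PF)


cpus = cpus_for(1_000.0)  # 1 exaFLOPS
cores = cpus * CORES_PER_CPU
print(f"{cpus:,} processors, {cores:,} cores")
```

The linear answer is about 440,000 processors and well over 100 million cores, so any figure of this kind is only an order-of-magnitude guide.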
We can assume the actual CPU will be a more powerful one, possibly with added ancillary processors, vector units, caches, and system-on-chip technology to speed things along.
If the Chinese get Tianhe-3 operational in 2020, more details will be released then.
Will they make their 2020 target? It is a state-sponsored effort, their moonshot. China has the three precursor systems and a track record of designing its own processor and interconnect technology. Its chances look promising.
Japan has its own exascale computer effort, the Fujitsu-built Post-K system using 64-bit Arm-based processors, and Europe has an EU exascaler, possibly a derivative of the Bull Sequana 25PF system, with a suggested 2023 release date.
It's China and the US leading the way, with Uncle Sam possibly developing a more user-friendly software environment. But that counts for nothing unless the big exaflopper iron arrives on time and isn't just a, er, flop. ®