Aurora delays keep Frontier supercomputer in #1 spot on Top500

EuroHPC's LUMI upgrades secure narrow lead over all-new Leonardo system

SC22 Despite expectations that we might see the long-awaited Aurora supercomputer crest the Top500 list of the world’s most powerful systems, the US Department of Energy's 1.1 exaflop Frontier machine at Oak Ridge National Lab continues to hold the number one spot.

Frontier's lead in the biannual ranking of publicly known supercomputers is likely to remain unchallenged, perhaps until the June 2023 rankings, when Argonne's Aurora supercomputer may finally have enough Xeons to complete the Linpack benchmark. Or, of course, until China puts its existing exascale systems to the test.

The US Frontier system, Japan's Fugaku, and EuroHPC's LUMI held on to the top three spots.

Much like Frontier, LUMI is based on an all-AMD architecture built by Cray that pairs optimized 64-core third-gen Epyc Milan CPUs and Instinct MI250x GPUs with HPE's Slingshot-11 NICs. Since its initial appearance as the third most powerful supercomputer on the Top500 in spring, the system has grown from 1.1 million cores to more than two million, demonstrating near-linear scaling at 309 petaflops of double-precision performance.

Leonardo rises

Without those upgrades, the LUMI system would certainly have fallen behind Italy's newly minted 174-petaflop Leonardo system, which now claims the title of fourth most powerful supercomputer on the Top500 and second most powerful system in Europe behind LUMI.

Based on Atos's BullSequana XH2000 platform, Leonardo pairs 1.4 million third-gen Intel Xeon Platinum cores with Nvidia's A100 40GB GPUs and 100Gbps InfiniBand NICs.

While no threat to Frontier, the Leonardo system still marks one of the best showings we've seen from Intel in recent years, even though delays with Sapphire Rapids have put the chip giant on the HPC naughty list. In the Top500 overall, Intel still provides three-quarters of the processors used by supercomputers on the list. However, the chipmaker is losing ground, most notably to AMD's Epyc processor line.

Case in point: of the 10 fastest systems on the Top500 ranking, Leonardo is one of only two using Intel processors. The other is China's 61-petaflop Tianhe-2 system, which uses Intel's nearly decade-old 12-core Xeon E5-2692v2 processors. AMD, by comparison, powers four of the top 10 systems today.

Beyond the addition of the Leonardo system, the list remains largely unchanged from the June/spring showing, at least as far as the top 10 supercomputers are concerned. The full list of systems on this fall's Top500 is available on the Top500 website.

Intel seeks redemption, Nvidia eyes a place in the top 5

Looking ahead to 2023, there are numerous systems to keep an eye out for, not least of which is Argonne National Laboratory's all-Intel Aurora supercomputer, which aims to double Frontier's performance to cross the two-exaflop barrier.

The machine has been delayed since 2018 by Intel's failure to ship chips on time. In September, Intel announced it had begun the initial shipments of its Sapphire Rapids CPU and Ponte Vecchio GPU blades to Argonne for integration into the system.

As of this month, it doesn't appear the system will be finished any time soon. Speaking to the press last week, the head of Intel's Super Compute Group, Jeff McVeigh, said the system wouldn't make an appearance on this fall's Top500 ranking because work was ongoing.

Whether Aurora will be operational in time for next spring's International Supercomputing Conference in Hamburg is unclear. But with server and supercomputer manufacturers like HPE already announcing availability of Intel's next-gen parts, we're not counting out the possibility.

Another anticipated system is the successor to Nvidia's Selene supercomputer. Announced in March, Nvidia's in-house Eos supercomputer will mesh 18 DGX SuperPODs for a total of 576 DGX H100 nodes. When complete, the fabless chipmaker estimates Eos will be capable of 275 petaflops of FP64 performance. If this bears out in the Linpack benchmark, that would put the system within spitting distance of EuroHPC's LUMI.
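
For a rough sense of what those Eos figures imply, here is a back-of-the-envelope sketch in Python. The SuperPOD count, node count, and petaflop figures come from the paragraph above; the eight-GPUs-per-node value is our assumption about the DGX H100 configuration, and the function and variable names are purely illustrative. Bear in mind that Nvidia's 275-petaflop number is an estimate while LUMI's 309 petaflops is a benchmark result, so the ratio is indicative at best.

```python
# Figures quoted in the article
SUPERPODS = 18            # DGX SuperPODs meshed together in Eos
DGX_NODES = 576           # total DGX H100 nodes
EOS_FP64_PFLOPS = 275.0   # Nvidia's estimated FP64 performance
LUMI_PFLOPS = 309.0       # LUMI's current Top500 result

# Assumption (not stated in the article): eight H100 GPUs per DGX H100 node
GPUS_PER_NODE = 8


def eos_breakdown():
    """Return rough per-pod and per-GPU figures implied by the numbers above."""
    nodes_per_superpod = DGX_NODES // SUPERPODS
    total_gpus = DGX_NODES * GPUS_PER_NODE
    # 1 petaflop = 1,000 teraflops
    fp64_tflops_per_gpu = EOS_FP64_PFLOPS * 1000 / total_gpus
    share_of_lumi = EOS_FP64_PFLOPS / LUMI_PFLOPS
    return {
        "nodes_per_superpod": nodes_per_superpod,
        "total_gpus": total_gpus,
        "fp64_tflops_per_gpu": round(fp64_tflops_per_gpu, 2),
        "share_of_lumi": round(share_of_lumi, 2),
    }


if __name__ == "__main__":
    for key, value in eos_breakdown().items():
        print(f"{key}: {value}")
```

Under these assumptions, that works out to 32 nodes per SuperPOD, 4,608 GPUs, just under 60 teraflops of FP64 per GPU, and roughly 89 percent of LUMI's score.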

However, it's unclear when we can expect to see the system come online. Like Argonne's Aurora, Nvidia's DGX H100 server platform also relies on Intel's 4th-gen Xeon Scalable processors, which have been delayed until early next year.

China's silence

One of the biggest unknowns remains China, which has taken a backseat in recent Top500 rankings when it comes to raw performance. While the country has 162 systems – 35 more than the US – on the Top500, just two rank in the top 10: the Sunway TaihuLight and Tianhe-2.

The thing is, we know they're holding back. China was recently confirmed to lead the US with at least two exascale-class supercomputers.

So why keep these systems out of the rankings? There are a couple of reasons. For one, it's important to remember that when the world's most powerful supercomputers aren't competing for a place on the Top500, many are put to work running war games and/or simulating nuclear weapons testing.

China relies heavily on US silicon and intellectual property – both chips and manufacturing kit – to design and build these systems. For this reason, the US has spent the past several years trying to limit China's access to key technologies including those required to build supercomputers that can compete with the West.

Earlier this year, the US barred the export of AMD Instinct and Nvidia's A100 and H100 GPUs to China and Russia because they exceeded newly imposed performance limits. This decision ultimately drove Nvidia to nerf its A100 so it could keep selling chips in China.

So while a return by China to the Top500 ranking would unquestionably bruise Western egos, it would also risk inviting stiffer sanctions on the region. ®

* The two updates of the list take place in June at the International Supercomputing Conference and in November at SC (the ACM/IEEE Supercomputing Conference), which is currently underway.
