Aurora breaks the exaFLOPS barrier but falls short of the final Frontier once again

With LLNL's AMD-powered El Capitan on the horizon, time is running out for Intel's Aurora to claim the number one spot

ISC Argonne National Laboratory's Aurora supercomputer has officially breached the exaFLOPS barrier, but, once again, it's fallen short of unseating Oak Ridge's Frontier system for the number one spot on this spring's Top500.

With Lawrence Livermore's (LLNL) El Capitan supercomputer expected to make its debut on the Top500 ranking as early as this fall, it seems the much-touted and much-delayed Aurora system may never claim the title.

Powered by 21,248 of Intel's high-bandwidth memory (HBM)-equipped Xeon Max processors and 63,744 GPU Max accelerators, Aurora was widely anticipated to become the most powerful supercomputer America has ever built when it was delivered to the Department of Energy's (DoE) Argonne National Laboratory last year.

Unfortunately, when last November's Top500 rolled around, Argonne had only managed to get Linpack running on about half the system. Even half-baked, the machine still spat out an impressive 585 petaFLOPS of double-precision performance. With this spring's ranking, we're beginning to see what the system is really capable of.

With what appears to be the full system, Argonne extracted just over an exaFLOPS of performance this time around, officially making it the second exascale machine to grace the Top500 ranking of publicly known supercomputers.

Of course, when it comes to exascale compute, it's well known that China has several supers capable of an exaFLOPS or more operating in the shadows. And with US-China trade relations continuing to deteriorate, particularly in the fields of HPC, AI, and semiconductor manufacturing, it's unlikely the Middle Kingdom will spill the beans on its exascale systems anytime soon.

One area where Intel's Aurora system falls squarely behind is power consumption. Despite using a much more modern architecture, the machine is far from the most efficient. Breaking the exaFLOPS barrier required a whopping 38.6 MW of power. For reference, Frontier managed to squeeze 1.2 exaFLOPS from just 22.7 MW.
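
To put the efficiency gap in concrete terms, here is a rough back-of-the-envelope sketch using only the figures quoted in this article. The 1.01 exaFLOPS Aurora score is the rounded number reported here, and the function and variable names are ours, purely for illustration.

```python
# Rounded figures as quoted in this article, not official Green500 values.
SYSTEMS = {
    "Aurora": {"exaflops": 1.01, "megawatts": 38.6},
    "Frontier": {"exaflops": 1.2, "megawatts": 22.7},
}


def gigaflops_per_watt(exaflops, megawatts):
    """Convert a Linpack score and power draw into gigaFLOPS per watt."""
    # 1 exaFLOPS = 1e9 gigaFLOPS, 1 MW = 1e6 W
    return (exaflops * 1e9) / (megawatts * 1e6)


efficiency = {
    name: gigaflops_per_watt(**figures) for name, figures in SYSTEMS.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.1f} gigaFLOPS per watt")
```

On these rounded inputs, Frontier delivers roughly twice as much Linpack performance per watt as Aurora.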

While still no match for Frontier, it appears that Argonne has plenty of room for improvement. As things stand, the lab has only managed to tap just over half of the machine's 1.98 exaFLOPS of theoretical peak performance.

After publication, we learned that Aurora's 1.01 exaFLOPS score was achieved with 87 percent of the machine active.
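
For a sense of how much headroom that leaves, the quick sketch below works out the share of theoretical peak the Linpack run reached. It uses the rounded figures from this article and assumes, for illustration only, that peak performance scales linearly with the fraction of the machine that was active. That assumption and the variable names are ours.

```python
RMAX_EF = 1.01          # measured Linpack score quoted in the article
RPEAK_EF = 1.98         # theoretical peak of the full machine, as quoted
ACTIVE_FRACTION = 0.87  # share of the machine active during the run

# Share of the full machine's theoretical peak that the run achieved.
overall_yield = RMAX_EF / RPEAK_EF

# Assumption: peak scales linearly with the active share of the machine.
active_peak_ef = RPEAK_EF * ACTIVE_FRACTION
active_yield = RMAX_EF / active_peak_ef

print(f"Share of full-system peak: {overall_yield:.1%}")
print(f"Estimated peak of active portion: {active_peak_ef:.2f} exaFLOPS")
print(f"Share of active-portion peak: {active_yield:.1%}")
```

Under that assumption, the run reached about 51 percent of the full machine's peak and a little under 59 percent of the peak of the portion that was actually running.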

Even if Aurora eventually manages to outperform Frontier, there's a far larger and more capable system just on the horizon. LLNL's El Capitan supercomputer will be among the first systems powered by AMD's MI300A APUs.

We looked at these chips in detail during AMD's launch event in December, but, in a nutshell, they combine three Zen 4 compute dies totaling 24 cores with six CDNA 3 GPU dies into a single socket. The GPUs and CPUs function as a single unit sharing up to 128GB of speedy HBM3 memory.

El Capitan is expected to have a peak performance of 2.3 exaFLOPS, nearly 400 petaFLOPS greater than Aurora (remember, that's theoretical, not real-world performance), making it roughly one Fugaku faster on paper.

Of course, as has been made plain by Aurora's Linpack benchmarks, actually harnessing all of that compute can be quite tricky at scale. So, perhaps Aurora can still pull off an 11th-hour win.

Alps arrives while Sierra slips from top 10

While the 10 most powerful systems on the list are largely unchanged from November, with Eagle, Fugaku, and Lumi holding onto the third, fourth, and fifth places respectively, we find that Switzerland's Alps supercomputer has ousted Leonardo for the number six spot.

With a Linpack score of 270 petaFLOPS, Alps is the most powerful system on the Top500 to use Nvidia's Grace-Hopper Superchips. Teased at GTC in 2022, Nvidia's GH200 began making its way into customers' hands earlier this year. It pairs a 72-core Arm processor and 480GB of LPDDR5x memory with an H100 GPU and between 96GB and 144GB of HBM3 or HBM3e memory.

Behind Alps, the Leonardo system is still holding on strong. However, in eighth place, we see that Spain's MareNostrum 5 ACC super has leapfrogged the valiant Summit supercomputer, picking up an additional 38 petaFLOPS since last fall for a total of 175 in the Linpack bench. This entry is especially curious, as despite its higher score, the machine actually appears to have shrunk slightly since last year with 17,920 fewer cores recorded this time around.

Coming in ninth and tenth place are Oak Ridge National Laboratory's venerable Summit and Nvidia's Eos supercomputers. That's not the 10k GPU version of Eos, mind you; that's a different machine.

The addition of the Alps system means that LLNL's Sierra super has officially been pushed out of the top 10. Powered by IBM's Power9 processors and Nvidia's now-ancient V100 GPUs, the system managed to hold its own in the top 10 for six years.

An eventful year ahead

While El Capitan may turn out to be the machine to beat in 2024, the Top500 may be in for another shakeup with several high-profile systems expected to come online later this year.

Among the largest will be the Jupiter system, Europe's first exascale supercomputer. It's not clear whether the machine will be ready in time for Supercomputing in November, but with 24,000 GH200 Superchips backed by SiPearl's Arm-based Rhea processors, Jupiter will reportedly exceed an exaFLOPS of performance in real-world HPC workloads.

Then there are the UK's Dawn and Isambard-AI systems. When complete, Dawn, which is based on a similar design to Aurora, will reportedly boast more than 10,000 GPUs with a theoretical peak performance of 532 petaFLOPS. The University of Bristol's Isambard-AI, meanwhile, is expected to top 200 petaFLOPS of peak FP64 perf.

There's also a good chance we'll see more cloud-based systems, like Microsoft's Eagle, find their way onto the Top500 ranking. With GPU bit barns, cloud providers, and hyperscalers deploying tens of thousands of GPUs for AI — Meta plans to deploy 350,000 H100s this year — there's little doubt somebody will find time to run Linpack on them at least once. ®
