18 zettaFLOPS of new AI compute coming online from Oracle late next year

New clusters to feature 800,000 Nvidia Blackwell and 50,000 AMD Instinct MI450X GPUs

Oracle on Tuesday revealed it would field more than 18 zettaFLOPS worth of AI infrastructure from Nvidia and AMD by the second half of next year.

This includes a cluster of 800,000 Nvidia GPUs capable of delivering up to 16 zettaFLOPS of peak AI performance — that's sparse FP4 in case you're wondering. 
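For a rough sense of what that works out to per accelerator, here's a quick back-of-the-envelope check in Python using only the figures Oracle has quoted; the per-GPU number that falls out is an implied average, not an official per-chip spec.

    # Back-of-the-envelope: implied sparse FP4 throughput per GPU,
    # derived purely from the cluster-level figures quoted above.
    cluster_flops = 16e21   # 16 zettaFLOPS, sparse FP4
    gpu_count = 800_000
    per_gpu_pflops = cluster_flops / gpu_count / 1e15
    print(f"~{per_gpu_pflops:.0f} petaFLOPS of sparse FP4 per GPU")  # ~20 petaFLOPS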

The cluster, part of Oracle Cloud Infrastructure's Zettascale10 offering, is a big win for Nvidia, which is furnishing not only the GPUs and rack systems but also the networking. Stitching the GPUs together will be Nvidia's Spectrum-X Ethernet switching platform, making this the latest large-scale cluster built around that tech. If that weren't enough, Oracle also plans to offer a slew of Nvidia AI services through its cloud platform.

While Nvidia counts its billions, AMD expects to see 50,000 of its MI450X-series accelerators deployed at Oracle data centers in the second half of next year, with additional deployments expected the following year.

First teased at AMD's Advancing AI event in June, the MI450X will be offered in a rack-scale architecture, dubbed Helios, similar to Nvidia's NVL72.

Each rack is equipped with 72 MI450X GPUs stitched together using an open alternative to Nvidia's high-speed NVLink interconnect called Ultra Accelerator Link (UALink). At OCP, we caught our first glimpse of what production Helios racks based on the new open rack wide (ORW) form factor will end up looking like.

Here's a look at what AMD's Helios rack might look like in the wild with its new open rack wide (ORW) form factor

In case you're wondering, the double-wide system is technically one rack as defined by the OCP spec.

AMD predicts that a single Helios rack will deliver 2.9 exaFLOPS of FP4 and up to 1.4 exaFLOPS of FP8 performance, along with 31 TB of HBM4 memory good for 1.4 petabytes a second of bandwidth. It's not clear at this point whether that's dense or sparse FLOPS, but if we had to guess, it's probably the latter. This puts it in the same performance class as Nvidia's upcoming Vera Rubin NVL144 systems, albeit with a boatload more HBM.

That puts OCI's initial deployment of 50,000 MI450Xs at just over two zettaFLOPS of ultra-low precision compute.
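That figure is easy to sanity-check against the rack-level specs AMD has quoted. A minimal sketch, assuming every one of those 50,000 GPUs lands in a fully populated 72-GPU Helios rack:

    # Sanity check: 50,000 MI450X GPUs spread across 72-GPU Helios racks,
    # each rated by AMD at 2.9 exaFLOPS of FP4.
    gpus = 50_000
    gpus_per_rack = 72
    fp4_per_rack = 2.9e18                      # FLOPS
    racks = gpus / gpus_per_rack               # ~694 racks
    total_zettaflops = racks * fp4_per_rack / 1e21
    print(f"~{total_zettaflops:.1f} zettaFLOPS of FP4")  # ~2.0 zettaFLOPS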

While it's fun to throw around the word zettaFLOPS, few customers will actually be able to harness all the compute Oracle is laying down. Not only would they have to lock in an entire cluster, but FP4 is also generally regarded as a storage format for AI inference, with model builders like OpenAI only now warming to it.

For the kinds of training jobs customers are likely to rent a cluster of 50,000 or more GPUs for, higher-precision datatypes like BF16 and FP8 have historically been preferred. That's not to say it's impossible to train a model natively at FP4. Nvidia recently published a paper exploring the merits of pretraining using 4-bit microscaling datatypes like NVFP4, and early findings suggest the datatype can achieve quality levels comparable to FP8.
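To make the "microscaling" idea concrete, here's a toy numpy sketch of block-wise 4-bit quantization: every block of values shares one scale factor, and each value is snapped to the small grid of magnitudes an E2M1 (FP4) number can represent. The block size and scale handling here are illustrative assumptions, not NVFP4's actual layout.

    # Toy block-wise ("microscaling") 4-bit quantization sketch.
    # Illustrative only: block size and scale format are assumptions,
    # not Nvidia's actual NVFP4 implementation.
    import numpy as np

    FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

    def fake_quantize_fp4(x, block=32):
        x = x.reshape(-1, block)
        scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID[-1]  # one scale per block
        scale[scale == 0] = 1.0
        scaled = x / scale
        # snap each value to the nearest representable FP4 magnitude, keeping its sign
        idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
        return (np.sign(scaled) * FP4_GRID[idx] * scale).ravel()  # dequantized values

    w = np.random.randn(1024).astype(np.float32)
    print("mean abs error:", np.abs(w - fake_quantize_fp4(w)).mean())

Real 4-bit training kernels do the matrix math directly on values like these, which is why the quality findings Nvidia reported matter.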

One company likely to end up getting access to a substantial quantity of Oracle's GPU hoard is OpenAI. Both Nvidia and AMD recently signed investment deals with the AI flag bearer, predicated on large-scale deployments of their accelerators by OpenAI's partners – and Oracle happens to be its biggest.

And while AMD's datacenter GPU market share is still dwarfed by Nvidia's, that's likely to change. Under a recently announced agreement, OpenAI will have the opportunity to acquire 160 million shares of the chipmaker at a penny a pop if the House of Zen can facilitate the deployment of six gigawatts of Instinct accelerators.

The 50,000 MI450X cluster announced by Oracle this week appears to be the first piece of an initial gigawatt-scale deployment. By our estimate, this suggests that Oracle could end up deploying another 180,000 or so MI450s as part of the deal. ®
