Oracle's $40B Nvidia hardware haul may be too hot for OpenAI's Abilene, Texas DC to handle

400,000 GB200s could be more than a 1.2GW datacenter can chew

Oracle will reportedly shell out around $40 billion on Nvidia's most advanced GPUs to provide compute power to OpenAI from the first US Stargate datacenter in Abilene, Texas, assuming the site can deliver enough electricity to handle the load.

Citing sources familiar with the matter, The Financial Times reported Friday the massive investment will pay for around 400,000 Nvidia GB200 superchips.

These accelerators, announced at Nvidia's GTC event last March, feature a pair of its most powerful Blackwell GPUs along with its homegrown Grace CPU. Thirty-six of these superchips form an NVL72 system capable of churning out 1.4 exaFLOPS of sparse FP4 compute.

This indicates that Oracle will pack the 1.2 gigawatt facility built by Crusoe with around 11,000 of these rack systems, totaling nearly 16 zettaFLOPS of the lowest-precision compute money can buy.

And with an estimated cost per rack of $3.5 million, the $40 billion price tag isn't too far off once you factor in all the trimmings necessary to get the liquid-cooled gear running.
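If you want to check our napkin, here's the arithmetic behind those rack, compute, and cost figures, using only the numbers quoted above (400,000 superchips, 36 per NVL72 rack, 1.4 exaFLOPS of sparse FP4 per rack, and an estimated $3.5 million per rack):

```python
# Back-of-the-napkin math for the Abilene build-out,
# using the figures quoted in the article.

superchips = 400_000
chips_per_rack = 36            # GB200 superchips per NVL72 rack
rack_flops_fp4 = 1.4e18        # sparse FP4 FLOPS per rack (~1.4 exaFLOPS)
rack_cost_usd = 3.5e6          # estimated cost per rack

racks = superchips / chips_per_rack
total_zettaflops = racks * rack_flops_fp4 / 1e21
hardware_cost_bn = racks * rack_cost_usd / 1e9

print(f"racks: {racks:,.0f}")                       # ~11,111
print(f"compute: {total_zettaflops:.1f} zettaFLOPS")  # ~15.6, i.e. "nearly 16"
print(f"hardware: ${hardware_cost_bn:.1f}B")          # ~$38.9B, shy of $40B
```

The gap between the ~$38.9 billion in raw rack hardware and the reported $40 billion is what the article attributes to "the trimmings" needed to get the liquid-cooled gear running.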

But a little back-of-the-napkin math suggests that either those figures are off or the 1.2 gigawatt facility may not have enough power to run them all at the same time.

Each NVL72 rack is rated for a peak power draw of 120 kilowatts, and that's before you factor in power- and cooling-related losses. By our estimate, you'd need 1.45 gigawatts to harness their full potential, assuming a power usage effectiveness (PUE) of 1.1.

And that's when the datacenter campus is complete. Only about 200MW of datacenter capacity will be ready this year. By our estimate, that's enough for about 1,500 NVL72 racks or about 54,000 GB200 superchips.
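The power shortfall falls out of the same napkin math. A minimal sketch, assuming roughly 11,000 racks at 120kW peak each and the same PUE of 1.1:

```python
# Power sanity check for Abilene, using the article's assumptions:
# ~11,000 NVL72 racks, 120 kW peak per rack, PUE of 1.1.

racks = 11_000
rack_power_kw = 120
pue = 1.1                      # assumed power usage effectiveness

needed_gw = racks * rack_power_kw * pue / 1e6
print(f"power needed: {needed_gw:.2f} GW")  # ~1.45 GW vs 1.2 GW on site

# What the first 200 MW of capacity supports this year:
racks_200mw = 200_000 / (rack_power_kw * pue)
print(f"racks on 200 MW: {racks_200mw:,.0f}")        # ~1,515
print(f"superchips: {racks_200mw * 36:,.0f}")        # ~54,545
```

In other words, running every rack flat out would demand roughly 20 percent more power than the campus is designed to deliver.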

The remaining gigawatt of datacenter capacity is expected to come online sometime in 2026, with Oracle set to lease the site for 15 years.

While power could prove problematic, Oracle and datacenter operator Crusoe could still make it work. Just because each of these rack-scale systems can draw 120kW doesn't mean they always will. Some degree of over-provisioning is to be expected. With the Abilene campus spanning eight buildings, it's unlikely that Oracle will attempt to wrangle all 400,000-some superchips into a single training cluster.

We anticipate a decent chunk of them will be tasked with jobs like inference, synthetic data generation, and reinforcement learning, which are unlikely to push the systems to their limits.

How OpenAI's Abilene supercluster stacks up

Assuming Oracle and Crusoe can overcome these power constraints, the Abilene datacenter campus will be among the most powerful AI supercomputers in the United States, offering 10-20x more compute capacity than rival Elon Musk's 200,000 GPU Colossus supercomputer.

Located in Memphis, Tennessee, the system is based on Nvidia's H100 and H200 GPUs and boasts nearly 800 exaFLOPS of sparse FP8 compute powered by a pair of 150MW substations and another 150MW of Tesla battery backups.
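Why the 10-20x range rather than a single figure? Comparing Abilene's roughly 16 zettaFLOPS of sparse FP4 against Colossus's roughly 800 exaFLOPS of sparse FP8 gives the high end; since Blackwell's FP4 throughput is double its FP8 rate, halving Abilene's number for a like-for-like FP8 comparison gives the low end. A quick sketch:

```python
# Origin of the 10-20x comparison between Abilene and Colossus.
# Figures from the article: ~15.6 zettaFLOPS sparse FP4 (Abilene),
# ~800 exaFLOPS sparse FP8 (Colossus).

abilene_fp4_zetta = 15.6
colossus_fp8_zetta = 0.8

raw = abilene_fp4_zetta / colossus_fp8_zetta          # FP4 vs FP8
matched = abilene_fp4_zetta / 2 / colossus_fp8_zetta  # both at FP8

print(f"raw ratio: ~{raw:.0f}x")          # ~20x
print(f"matched precision: ~{matched:.0f}x")  # ~10x
```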

Or at least that's the plan. So far, only one of the two substations has been completed. In the meantime, additional capacity is being supplied by more than a dozen smog-belching gas turbines. So, OpenAI and Oracle certainly wouldn't be the first to run up against power constraints.

With that said, OpenAI's Abilene model factory isn't even close to the biggest AI datacenter project announced so far.

Back in December, Facebook parent Meta drew up plans for a 2.2 gigawatt datacenter campus in Richland Parish, Louisiana, which will be built out in stages over the next five years or so.

Meta hasn't said how many GPUs it plans to pack the facility with, but has said that it expects to have 1.3 million accelerators across its datacenter footprint by year's end.

Stargate goes international

Friday's report comes just days after OpenAI took its $500 billion Stargate scheme international. Described as the AI giant's moonshot, Stargate is a global joint venture that aims to build exascale-class infrastructure and secure strategic compute independence.

Working with Oracle, Nvidia, Cisco, SoftBank, and G42 Cloud, OpenAI plans to bring an additional gigawatt of compute capacity online in the United Arab Emirates (UAE). The first phase of the project is expected to total 200MW and come online in 2026.

The Register reached out to Oracle and OpenAI for comment, but hadn't heard back at the time of publication. ®
