Nvidia CEO promises sustainability salvation in the cult of accelerated computing
Not quite as dramatic as AMD's Lisa Su and her visions of nuclear-powered supercomputers
GTC On the surface, Nvidia's spring GPU Technology Conference once again opened with a keynote dominated by generative AI technologies.
That shouldn't come as a surprise. Despite the progress made by the likes of AMD, Intel, and others, Nvidia remains the largest supplier of GPUs and accelerators used for machine-learning workloads.
But behind all of the pomp afforded to Nvidia's latest AI models, acceleration frameworks, and hardware — of which there's plenty to talk about — lurked the issue of sustainability. That being the small problem of how exactly do we power and cool all of these machines driving today's machine learning training and inference.
It's a point that came up repeatedly during CEO Jensen Huang's GTC keynote, in which he hammered on just how inefficient and expensive it is to run these workloads on general purpose servers.
There's merit to his argument. While not every workload is easily parallelized, those that are can run considerably more efficiently on accelerators. There's a reason why GPU-accelerated supercomputers dominate the Green500.
"Cloud computing has grown 20 percent annually into a massive $1 trillion industry. Some 30 million CPU servers do the majority of the processing," Huang said. "As 'Moore's Law' ends, increasing CPU performance comes with increased power, and the mandate to decrease carbon emissions is fundamentally at odds with the need to increase datacenters."
Acceleration, he argues, is the only way to reduce power consumption. But it's not just that Nvidia wants to sell you more GPUs at a time when demand is down across multiple segments; the company also has a whole portfolio of accelerators, and software to use with them, that it wants to sell you. Nvidia has GPUs dedicated to AI training, others designed for inferencing, visualization, and video processing, plus networking and data processing kit to tie them all together.
"Datacenters must accelerate every workload to reclaim power and free GPUs for revenue generating workloads and Nvidia's Bluefield offloads and accelerates the datacenter operating system and infrastructure software," Huang said in a pitch for the company's Bluefield-3 DPUs.
And for workloads that can't be accelerated using GPUs or DPUs, Nvidia has its Arm-based Grace CPU. "The entire 144-core Grace Superchip is so low power that it can be air cooled," Huang said, holding up the chip's 5x8-inch 1U heatsink.
For the record, the Grace Superchip, which does include 1TB of LPDDR memory, still has a rated thermal design power (TDP) of 500W, so it's not exactly the lowest power part out there. For comparison, AMD's 96-core/192-thread Epyc 4 has a configurable TDP of 360W-400W. Which of the two proves more efficient will therefore come down to performance on a given workload rather than TDP alone.
- Nvidia hooks TSMC, ASML, Synopsys on GPU accelerated lithography
- Nvidia's generative AI inferencing card is just two H100s glued together
- Unless things change, first zettaflop systems will need nuclear power, AMD's Su says
- Years late and 36 cores short of AMD, who are Intel's 4th-gen Xeons even for?
Having built the hardware, it's clear that Nvidia's next step is to build software libraries that accelerate workloads in every industry that relies on CPUs today — in effect creating new markets for its hardware. The company's new computational lithography libraries are evidence of just that, with Huang claiming that TSMC could replace its 40,000 CPU nodes with as few as 500 DGX H100 servers while cutting power consumption from 35MW to 5MW in the process.
The field is more crowded than ever
If all of this sounds familiar, that's because AMD CEO Lisa Su hit on many of the same points while speaking at the International Solid-State Circuits Conference in late February.

During her keynote, Su warned that unless drastic steps are taken within the decade to improve the efficiency of compute architectures, the world's most powerful supercomputers wouldn't just simulate nuclear reactions, they'd have to run on them.
AMD's answer to this involves a variety of compute architectures and design principles, including several, like chiplets, that it pioneered. Using these technologies AMD is working to embed accelerators of all kinds into its platforms. For example, the company's upcoming MI300 APU melds Zen 4 CPU cores with its CDNA3 GPUs and a boatload of high bandwidth memory.
The x86 contender is also working to build FPGAs, AI accelerators, and DPUs, acquired from Xilinx and Pensando, into its chips to accelerate more workloads and improve compute efficiency.
And not to be left out, a growing portion of Intel's 4th-Gen Xeon dies are already consumed by dedicated accelerators for machine learning, cryptography, compression, data-streaming, analytics, and security.
Intel is also working on an APU of its own — though it prefers the term XPU — called Falcon Shores, though that platform has been delayed until at least 2025 in what's become a trend for the ailing chipmaker.
Nvidia is therefore far from the only chipmaker advocating more use of accelerators. ®