
Los Alamos to power up supercomputer using all-Nvidia CPU, GPU Superchips

HPE-built system to be used by Uncle Sam for material science, renewables, and more


Nvidia will reveal more details about its Venado supercomputer project today at the International Supercomputing Conference in Hamburg, Germany.

The hope is that Venado will be the first in a wave of high-performance computers built on an all-Nvidia architecture, in this case Grace-Hopper Superchips, which combine CPU and GPU dies, alongside Grace CPU-only Superchips.

This supercomputer “will be the first system deployed not just with Grace-Hopper in terms of the converged Superchip but it’ll also have a cluster of Grace CPU-only Superchip modules,” Dion Harris, Nvidia’s head of datacenter product marketing for HPC, AI, and Magnum IO, said during an Nvidia press conference ahead of ISC.

Nvidia claims the system, which is being built in collaboration with Hewlett Packard Enterprise (HPE) for Los Alamos National Laboratory (LANL), will deliver “10 exaflops of peak AI performance.”

First teased in early 2021, Venado is designed to accelerate LANL’s modeling, simulation, and data analysis of material science, renewable energy, and energy distribution.

The Register has asked Nvidia to clarify the precision behind this “AI performance” figure, whether it is INT8, FP16, or something else. Traditionally, supercomputer performance numbers are given in the context of FP64. The system is nonetheless noteworthy because it points to real-world use cases for the Grace-Hopper CPU/GPU Superchips. It also signals open season in the world of HPC chips and systems, which has been dominated by Intel/Nvidia or, more recently, AMD/Nvidia pairings.

Announced at GTC this spring, the Grace-Hopper Superchip is a daughterboard that fuses a 72-core Arm-compatible Grace CPU die with an H100 GPU over the company’s 900 GB/s NVLink-C2C interconnect tech. The Superchip boasts 512GB LPDDR5x DRAM and 80GB of HBM3 video memory.

For workloads that aren’t yet GPU accelerated, Nvidia’s Grace CPU-only Superchip swaps the H100 GPU for a second CPU die, for a total of 144 cores and 1TB of DRAM.

Together, LANL will have access to a “true heterogeneous environment that will be built on our platform, and will allow them to use the same programming model across both and get optimum performance across not just their GPU-accelerated apps, but that long tail of non-GPU-accelerated apps,” Harris said.

Nvidia preps for super year of computing

Venado is far from Nvidia’s only supercomputing project in development. The chipmaker’s CPUs and GPUs are at the heart of several upcoming systems, including the Swiss National Supercomputing Centre’s (CSCS) Alps system.

Announced in early 2021, and built in collaboration with HPE, that big beast will replace Piz Daint as a general-purpose research system. And like Venado, it will also use Nvidia’s Grace CPUs when it comes online next year.

Despite now having a complete ecosystem of CPU, GPU, and networking tech, Nvidia isn’t giving up on x86 just yet. The company is also working with the University of Tsukuba in Japan, the University of Bristol in England, and the Texas Advanced Computing Center (TACC) to develop a wave of x86-based supercomputers using its Hopper H100 GPUs.

What’s more, Eos, the successor to Nvidia’s in-house Selene supercomputer, will also use x86 processors from Intel. The supercomputer is based on Nvidia’s DGX platform and will feature 4,608 H100 GPUs to deliver what Nvidia claims is 18.4 exaflops of AI computing performance. ®
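
For a rough sense of scale, the Eos figures above can be turned into a per-GPU number with some back-of-envelope arithmetic. The sketch below only divides the quoted totals; it says nothing about which precision the figure refers to. The Venado line is purely hypothetical: it assumes Venado's "10 exaflops" is measured the same way as the Eos claim, which Nvidia has not confirmed, so treat it as an illustration rather than a hardware count.

```python
# Back-of-envelope arithmetic on the vendor-claimed figures quoted in the article.
EXA = 10**18
PETA = 10**15


def per_gpu_petaflops(total_exaflops: float, gpu_count: int) -> float:
    """Average claimed throughput per GPU, in petaflops."""
    return total_exaflops * EXA / gpu_count / PETA


def implied_gpu_count(total_exaflops: float, petaflops_per_gpu: float) -> float:
    """GPUs needed to reach a total, given a per-GPU figure."""
    return total_exaflops * EXA / (petaflops_per_gpu * PETA)


# Eos: 18.4 exaflops across 4,608 H100 GPUs (both numbers from the article).
eos_per_gpu = per_gpu_petaflops(18.4, 4608)
print(f"Eos: ~{eos_per_gpu:.2f} petaflops of claimed AI performance per H100")

# ASSUMPTION (not from the article): Venado's 10 exaflops uses the same
# per-GPU figure as Eos. The result is an estimate, not a confirmed spec.
venado_estimate = implied_gpu_count(10, eos_per_gpu)
print(f"Venado (hypothetical): ~{venado_estimate:.0f} Hopper GPUs")
```

If the precision behind the two claims differs, the second number changes with it, which is why the precision question matters.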

