HPE goes Cray for Nvidia's Blackwell GPUs, crams 224 into a single cabinet
Meanwhile, new ProLiant servers offer choice of Gaudi, Hopper, Instinct acceleration
If you thought Nvidia's 120 kW NVL72 racks were compute dense with 72 Blackwell accelerators, they have nothing on HPE Cray's latest EX systems, which will pack more than three times as many GPUs into a single cabinet.
Announced ahead of next week's Supercomputing conference in Atlanta, Cray's EX154n platform will support up to 224 Nvidia Blackwell GPUs and 8,064 Grace CPU cores per cabinet. That works out to just over 10 petaFLOPS at FP64 for HPC applications or over 4.4 exaFLOPS of sparse FP4 for AI and machine learning workloads, where precision usually isn't as big a deal.
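For a rough sense of scale, those cabinet totals can be divided back out per GPU. The sketch below uses only the figures quoted above; the variable names are ours, and because the totals are given as "just over" and "over," the per-GPU results are approximate floors rather than official specs.

```python
# Back-of-envelope check using only figures quoted in the article.
GPUS_PER_CABINET = 224           # HPE Cray EX154n cabinet
GPUS_PER_NVL72_RACK = 72         # Nvidia NVL72 rack, for comparison
FP64_TOTAL_PFLOPS = 10.0         # "just over 10 petaFLOPS", treated as a floor
FP4_SPARSE_TOTAL_EFLOPS = 4.4    # "over 4.4 exaFLOPS", treated as a floor

# How many times denser the Cray cabinet is than an NVL72 rack.
density_ratio = GPUS_PER_CABINET / GPUS_PER_NVL72_RACK

# Implied per-GPU throughput (1 petaFLOPS = 1,000 teraFLOPS, 1 exaFLOPS = 1,000 petaFLOPS).
fp64_per_gpu_tflops = FP64_TOTAL_PFLOPS * 1_000 / GPUS_PER_CABINET
fp4_per_gpu_pflops = FP4_SPARSE_TOTAL_EFLOPS * 1_000 / GPUS_PER_CABINET

print(f"GPU density vs NVL72: {density_ratio:.2f}x")
print(f"Implied FP64 per GPU: at least ~{fp64_per_gpu_tflops:.1f} teraFLOPS")
print(f"Implied sparse FP4 per GPU: at least ~{fp4_per_gpu_pflops:.1f} petaFLOPS")
```

The density ratio comes out a little above three, which is where the "more than three times as many GPUs" claim comes from.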
Specifically, each EX154n accelerator blade will feature a pair of 2.7 kW Grace Blackwell Superchips (GB200), each of which is equipped with two Blackwell GPUs and a single 72-core Arm CPU. Those two Superchips will be interconnected by Nvidia's NVL4 reference configuration.
At a rack level, the compute alone will consume upwards of 300 kW, so it goes without saying that, just like past EX systems, HPE's Blackwell blades will be liquid cooled.
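The 300 kW figure follows from the per-Superchip numbers. Here is a minimal sketch of that arithmetic, assuming a fully populated cabinet; the blade count is our inference from the quoted figures, not something HPE has stated, and the power total covers the Superchips only.

```python
# Cabinet-level arithmetic from the per-Superchip figures quoted in the article.
GPUS_PER_CABINET = 224
GPUS_PER_SUPERCHIP = 2       # each GB200 carries two Blackwell GPUs
CORES_PER_SUPERCHIP = 72     # one 72-core Grace CPU per GB200
SUPERCHIPS_PER_BLADE = 2     # each EX154n blade holds a pair of GB200s
SUPERCHIP_KW = 2.7           # quoted power per Superchip

superchips = GPUS_PER_CABINET // GPUS_PER_SUPERCHIP
blades = superchips // SUPERCHIPS_PER_BLADE      # inferred, not an HPE figure
cpu_cores = superchips * CORES_PER_SUPERCHIP
compute_kw = superchips * SUPERCHIP_KW           # Superchips only, no networking or pumps

print(f"{superchips} Superchips on {blades} blades")
print(f"{cpu_cores:,} Grace cores")
print(f"{compute_kw:.1f} kW of compute per cabinet")
```

The core count matches HPE's 8,064, and the power total lands just above 300 kW, roughly two and a half times the 120 kW of an NVL72 rack.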
In fact, these systems are completely fanless right down to the all-new Slingshot 400 family of Ethernet NICs, cables, and switches. As the name suggests, Slingshot 400 represents a welcome upgrade over its predecessor, pushing bandwidth from 200 to 400 Gbps, bringing it in line with current-gen Ethernet and InfiniBand networking.
HPE's prior-gen Slingshot 200 interconnects have become a mainstay of large-scale supercomputing platforms and are at the heart of the Frontier, Aurora, and Lumi machines to name just a handful.
Unfortunately, anyone looking to get their hands on Cray's super-dense Blackwell systems and speedy Slingshot 400 networking will have to wait a while. Neither is expected to ship until late 2025.
If conventional CPU-based HPC is more your thing, Cray's fifth-gen Epyc-based EX4252 Gen 2 compute blades are due out next spring and will pack up to eight 192-core Turin-C processors for a total of 98,304 cores per cabinet.
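The per-cabinet core count also tells us how many blades fit. The sketch below works backward from the quoted figures, assuming every blade is fully populated with eight 192-core parts; the resulting blade count is our inference, not a number from HPE.

```python
# Work backward from the quoted core count to an implied blade count.
CPUS_PER_BLADE = 8
CORES_PER_CPU = 192
CORES_PER_CABINET = 98_304

cores_per_blade = CPUS_PER_BLADE * CORES_PER_CPU

# A zero remainder means the quoted total divides evenly into full blades.
blades_per_cabinet, remainder = divmod(CORES_PER_CABINET, cores_per_blade)

print(f"{cores_per_blade:,} cores per blade")
print(f"{blades_per_cabinet} blades per cabinet (remainder {remainder})")
```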
Cray will also begin shipping upgraded E2000 storage systems, which it claims will more than double the I/O performance of prior generations thanks to faster PCIe 5.0-based NVMe storage. HPE expects to start shipping these storage arrays in early 2025.
While HPE's Cray EX platforms promise greater density than a typical server or rack, they aren't exactly the kind of systems that can be deployed in your average datacenter. So HPE is also rolling out a pair of new air-cooled ProLiant Compute servers, which use its enterprise-focused iLO lights-out management system.
These systems will be fairly familiar to anyone who's ever seen an Nvidia HGX platform, with both the XD680 and XD685 servers boasting support for eight accelerators of your choice.
Surprisingly, we aren't limited to just Nvidia and AMD GPUs as you might expect. The XD680 actually comes standard with eight Intel Gaudi3 accelerators totaling 1 TB of HBM2e. As we reported in spring, Gaudi3 is quite competitive with the current crop of accelerators. Each is capable of churning out 1.8 petaFLOPS of dense BF16 performance, giving it an edge in compute-bound workloads over the H100, H200, and AMD's MI300X.
Stepping up to HPE's XD685, you have the choice of either eight Nvidia H200s with a combined 1.1 TB of HBM3e or the upcoming Blackwell GPUs – presumably B200 – which should boost memory capacity to 1.5 TB. The former is due out in early 2025, while timing for the Blackwell-based systems remains rather vague.
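Those memory totals line up with eight accelerators per box. The sketch below multiplies them out, with one loud caveat: the per-accelerator HBM capacities are assumptions based on the vendors' published specs as we understand them, not figures from HPE's announcement, and the B200 entry is doubly tentative since the exact Blackwell part is only presumed.

```python
# Total HBM per eight-way server, from assumed per-accelerator capacities.
ACCELERATORS_PER_SERVER = 8

# Assumed HBM capacity per accelerator in GB (vendor specs, not from HPE).
hbm_gb_per_accelerator = {
    "Intel Gaudi3 (XD680)": 128,
    "Nvidia H200 (XD685)": 141,
    "Nvidia B200 (XD685, presumed)": 192,
}

total_hbm_gb = {
    name: gb * ACCELERATORS_PER_SERVER
    for name, gb in hbm_gb_per_accelerator.items()
}

for name, total in total_hbm_gb.items():
    # Decimal terabytes, rounded the way the quoted totals are.
    print(f"{name}: {total} GB (~{total / 1000:.1f} TB)")
```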
If Nvidia isn't your style, or you need more memory, HPE is also rolling out a version of the system with AMD's newly launched MI325X. That system, announced alongside the accelerator in October, will boast up to 2 TB of HBM3e memory on board and is set to ship in the first quarter of 2025. ®