Nvidia reveals 144-core Arm-based Grace 'CPU Superchip'
Yeah, right, who needs a takeover?
GTC Nvidia is cramming two Arm-based CPU processors and an LPDDR5x memory subsystem into one unit, its newly revealed 144-core Grace "CPU Superchip," which the graphics giant claims will be much faster and 2x more energy-efficient than two of AMD's best Epyc processors combined.
This Grace Superchip, unveiled Tuesday at Nv's virtual GTC 2022 event, represents Nvidia's first swing at CPU-only systems for AI and HPC applications. It is expected to hit the market in the first half of 2023 loaded with support for all of Nvidia's software, including the trendy Omniverse platform, which we will hear much more about this week.
You may be thinking, didn't Nvidia announce this processor at last year's GTC? Not exactly. The chip designer did reveal last year what it is now calling the Grace Hopper "Superchip," which combines one Grace CPU processor and one GPU based on the new Hopper architecture for large-scale HPC and AI applications.
Thus, the Grace Superchip is two Grace CPU processors in one unit including LPDDR5x RAM, while the Grace Hopper Superchip pairs a Grace CPU processor with a Hopper GPU.
Getting to production with the new chips does not require owning Arm, of course, which is good news since the Nvidia-Arm acquisition failed. Instead, the GPU giant is licensing Arm's server-tinged Neoverse core blueprints for the Grace CPU cores. Nv said the Grace processor is based on the Armv9 architecture, which makes us believe that Grace is using Arm's new N2 cores, the only Neoverse design using Armv9 that has been announced to date.
But back to the 144-CPU-core Grace Superchip, which Nvidia said is faster than "today's leading server chips" in addition to having twice the energy efficiency and memory bandwidth. In this case, Nvidia is comparing the chip to two 64-core, 225-watt AMD Epyc 7742 processors, which provide the x86 CPU muscle for Nvidia's DGX A100 systems.
Nvidia said a system running a single Grace Superchip can achieve an estimated score of 740 on the SPECrate 2017_int_base benchmark that is used to measure CPU performance. That makes it 50 percent faster than the CPU horsepower of the AMD Epyc-based DGX A100, according to the company. However, the AMD Epyc 7742 was released all the way back in 2019 and is one generation behind AMD's current Epyc lineup. Plus, AMD is planning to release its next-generation Epyc processors, code-named Genoa, this year, so we'll have to wait and see how Grace holds up against two generations of Epyc improvements.
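For readers who want to sanity-check the comparison, here is a minimal Python sketch of the arithmetic behind it. It assumes "50 percent faster" means exactly 1.5x; the helper name `implied_baseline` is ours, and the resulting baseline is an inference from Nvidia's two numbers, not a score Nvidia or AMD has published.

```python
# Figures quoted in the article (Nvidia's own estimates, not independent measurements).
GRACE_SPECRATE_INT_BASE = 740   # estimated SPECrate 2017_int_base score
CLAIMED_SPEEDUP = 1.5           # assumption: "50 percent faster" taken as exactly 1.5x


def implied_baseline(score: float, speedup: float) -> float:
    """Return the baseline score that would make `score` exactly `speedup` times higher."""
    return score / speedup


baseline = implied_baseline(GRACE_SPECRATE_INT_BASE, CLAIMED_SPEEDUP)
print(f"Implied score for the DGX A100's two Epyc 7742s: about {baseline:.0f}")
```

Swapping in a score for a newer Epyc part, once one is available, would show how much of the claimed lead survives.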
"The Grace CPU Superchip will excel at the most demanding HPC, AI, data analytics, scientific computing and hyperscale computing applications with its highest performance, memory bandwidth, energy efficiency and configurability," Nvidia said.
Not all the details for the Grace chip have been revealed, but Paresh Kharya, Nvidia's director of datacenter computing, told The Register that, by bringing together two Arm Neoverse-based CPU processors, the Superchip will fit 144 CPU cores into a single socket in a server, which is the only kind of socket configuration Nvidia is considering for now.
The superchip includes a memory subsystem with up to 1TB of LPDDR5x memory and error correction code capabilities to provide "the best balance of speed and power consumption." The company said this low-power memory subsystem will deliver 1TBps of bandwidth, double that of traditional DDR5 designs.
With the two CPU processors and memory subsystem combined, the Grace Superchip consumes "only" 500 Watts of power, Nvidia claims. At first blush, this seems like a strange "only," given that two AMD Epyc 7742s combined would represent 450 Watts at full power, but Kharya said that 450 Watts doesn't include the power required for the system memory, which is separate in the DGX A100. It's this consideration that makes the Grace Superchip's 500 Watts more attractive from a power efficiency standpoint, he added.
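Kharya's point can be made concrete with a rough Python sketch. It assumes the 50 percent performance lead from Nvidia's SPECrate estimate applies, and it treats the Epyc side's memory power as an unknown input: the wattages in the loop are hypothetical placeholders, not measured DGX A100 figures, and the function names are ours.

```python
GRACE_WATTS = 500           # Grace Superchip, memory included (Nvidia's figure)
EPYC_CPU_WATTS = 2 * 225    # two Epyc 7742s at full power, memory excluded
CLAIMED_SPEEDUP = 1.5       # assumption: the "50 percent faster" SPECrate estimate


def efficiency_ratio(epyc_memory_watts: float) -> float:
    """Grace performance-per-watt divided by dual-Epyc performance-per-watt."""
    epyc_total_watts = EPYC_CPU_WATTS + epyc_memory_watts
    return CLAIMED_SPEEDUP * epyc_total_watts / GRACE_WATTS


def memory_watts_for_ratio(target_ratio: float) -> float:
    """Epyc-side memory power at which the efficiency ratio equals `target_ratio`."""
    return target_ratio * GRACE_WATTS / CLAIMED_SPEEDUP - EPYC_CPU_WATTS


for watts in (0, 100, 200):  # hypothetical memory power draws, for illustration only
    print(f"{watts:>3} W of Epyc-side memory -> {efficiency_ratio(watts):.2f}x perf per watt")

print(f"Memory power needed for a 2x ratio: {memory_watts_for_ratio(2.0):.0f} W")
```

Under these assumptions, the more power the separate system memory draws, the better Grace's 500 Watts looks. Nvidia's 2x efficiency claim may well rest on different workloads or measurements than this back-of-envelope model.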
The Grace Superchip's two CPU processors are connected within the package by a high-speed, low-latency, chip-to-chip interconnect called the NVLink-C2C, which is also used to connect the GPU and CPU in the Grace Hopper Superchip. Nv plans to offer custom silicon integration services to businesses with the NVLink-C2C interconnect, which it will use to design custom chips that connect silicon designs from other companies with Nvidia's portfolio of processors, including GPUs, DPUs and CPUs.
Nvidia said it is working with "leading HPC, supercomputing, hyperscale and cloud customers" for the Grace superchip, and Kharya clarified that the company is open to "all the levels of integration and engagement" with both server makers and cloud providers. He added that the Grace chip is designed to work in a variety of configurations: from CPU-only servers to servers with multiple discrete Hopper GPUs.
During his keynote for GTC, Nvidia CEO Jensen Huang expanded on the possibilities of Grace and Hopper mashups, and said his company can also create a Superchip with one Grace CPU processor and two Hopper GPUs. In the future, he said, his biz will expand the use of its NVLink-C2C interconnect to work with other kinds of chips that integrate its CPUs, GPUs, DPUs, network interface cards, and systems-on-chip.
Kharya said the new Grace superchip will create new opportunities for Nvidia in HPC applications and other areas that don't yet benefit from the acceleration capabilities of GPUs.
"There's a class of HPC applications, a long tail of HPC applications that have not yet been accelerated. Those applications would immediately benefit from the high performance here, but also applications in data analytics and in hyperscale computing, where you really need high performance cores and high memory bandwidth and energy efficiency. Those all would benefit from this design," he said. ®