Nvidia reveals 144-core Arm-based Grace 'CPU Superchip'

Yeah, right, who needs a takeover?


GTC Nvidia is cramming two Arm-based CPU processors and a DDR5 memory subsystem into one unit, its newly revealed 144-core Grace "CPU Superchip," which the graphics giant claims will be much faster and 2x more energy-efficient than two of AMD's best Epyc processors combined.

This Grace Superchip, unveiled Tuesday at Nv's virtual GTC 2022 event, represents Nvidia's first swing at CPU-only systems for AI and HPC applications. It is expected to hit the market in the first half of 2023 loaded with support for all of Nvidia's software, including the trendy Omniverse platform, which we will hear much more about this week.

You may be thinking, didn't Nvidia announce this processor at last year's GTC? Not exactly. The chip designer did reveal last year what it is now calling the Grace Hopper "Superchip," which combines one Grace CPU processor and one GPU based on the new Hopper architecture for large-scale HPC and AI applications.

Thus, the Grace Superchip is two Grace CPU processors in one unit including DDR5 RAM, while the Grace Hopper Superchip pairs a Grace CPU processor with a Hopper GPU.

Nvidia's rendering of its Grace Superchip: two processors in one with RAM

Getting to production with the new chips does not require owning Arm, of course, which is good news since the Nvidia-Arm acquisition failed. Instead, the GPU giant is licensing Arm's server-tinged Neoverse core blueprints for the Grace CPU cores. Nv said the Grace processor is based on the Armv9 architecture, which leads us to believe that Grace is using Arm's new Neoverse N2 cores, the only Neoverse design announced to date that uses Armv9.

But back to the 144-CPU-core Grace Superchip, which Nvidia said is faster than "today's leading server chips" in addition to having twice the energy efficiency and memory bandwidth. In this case, Nvidia is comparing the chip to two 64-core, 225-watt AMD Epyc 7742 processors, which provide the x86 CPU muscle for Nvidia's DGX A100 systems.

Nvidia said a system running a single Grace Superchip can achieve an estimated score of 740 on the SPECrate 2017_int_base benchmark that is used to measure CPU performance. That makes it 50 percent faster than the CPU horsepower of the AMD Epyc-based DGX A100, according to the company. However, the AMD Epyc 7742 was released all the way back in 2019 and is one generation behind AMD's current Epyc lineup. Plus, AMD is planning to release its next-generation Epyc processors, code-named Genoa, this year, so we'll have to wait and see how Grace holds up against two generations of Epyc improvements.
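
As a sanity check on those numbers, here is the arithmetic written out. The 740 estimate and the 1.5x ratio are Nvidia's; the implied dual-Epyc score is our own back-of-the-envelope derivation, nothing more.

    # Back-of-the-envelope only: Nvidia's estimated SPECrate 2017_int_base score
    # for one Grace Superchip and its claimed 1.5x advantage over the two
    # Epyc 7742s in a DGX A100. The implied baseline is our own derivation.
    grace_superchip_score = 740   # Nvidia's estimate for the 144-core Superchip
    claimed_speedup = 1.5         # "50 percent faster" than the dual-Epyc DGX A100

    implied_dual_epyc_score = grace_superchip_score / claimed_speedup
    print(f"Implied dual Epyc 7742 score: ~{implied_dual_epyc_score:.0f}")  # ~493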

"The Grace CPU Superchip will excel at the most demanding HPC, AI, data analytics, scientific computing and hyperscale computing applications with its highest performance, memory bandwidth, energy efficiency and configurability," Nvidia said.

Not all the details of the Grace chip have been revealed, but Paresh Kharya, Nvidia's director of datacenter computing, told The Register that, by bringing together two Arm Neoverse-based CPU processors, the Superchip fits 144 CPU cores into a single server socket, which is the only socket configuration Nvidia is considering for now.

The superchip includes a memory subsystem with up to 1TB of LPDDR5x memory and error correction code capabilities to provide "the best balance of speed and power consumption." The company said this low-power memory subsystem will deliver 1TBps of bandwidth, double that of traditional DDR5 designs.
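
For a sense of where a figure like 1TBps can come from, the sketch below simply multiplies a per-pin data rate by an interface width. The 6400 MT/s rate and the 1280-bit combined width are assumptions chosen purely to illustrate the arithmetic, not confirmed Grace Superchip specifications.

    # Illustrative only: peak DRAM bandwidth = per-pin data rate x interface width.
    # The 6400 MT/s rate and 1280-bit combined width are assumptions for the sake
    # of the arithmetic, not confirmed Grace Superchip specifications.
    data_rate_mtps = 6400              # LPDDR5X mega-transfers per second per pin
    bus_width_bytes = 1280 // 8        # hypothetical combined interface width

    peak_bandwidth_gb_s = data_rate_mtps * 1e6 * bus_width_bytes / 1e9
    print(f"Peak bandwidth: ~{peak_bandwidth_gb_s:.0f} GB/s")  # ~1024 GB/s, i.e. ~1TBps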

With the two CPU processors and memory subsystem combined, the Grace Superchip consumes "only" 500 watts of power, Nvidia claims. At first blush, this seems like a strange "only," given that two AMD Epyc 7742s combined would represent 450 watts at full power, but Kharya said that 450 watts doesn't include the power required for the system memory, which is separate in the DGX A100. It's this consideration that makes the Grace Superchip's 500 watts more attractive from a power efficiency standpoint, he added.
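
Kharya's argument is easier to follow as rough performance-per-watt arithmetic. The sketch below uses the figures quoted in this article plus one assumption of our own: a 100 W placeholder for the power drawn by the DGX A100's separate system memory.

    # Rough performance-per-watt comparison using the figures quoted in this article.
    # The 100 W allowance for the DGX A100's separate system DRAM is a placeholder
    # assumption for illustration, not a published Nvidia or AMD number.
    grace_score, grace_power_w = 740, 500   # Superchip score and power, memory included
    dual_epyc_score = 740 / 1.5             # implied from the "50 percent faster" claim
    dual_epyc_power_w = 2 * 225 + 100       # two 225 W Epyc 7742s plus assumed DRAM power

    grace_eff = grace_score / grace_power_w
    epyc_eff = dual_epyc_score / dual_epyc_power_w
    print(f"Grace: {grace_eff:.2f}/W, dual Epyc: {epyc_eff:.2f}/W, "
          f"ratio ~{grace_eff / epyc_eff:.2f}x")

With that particular memory-power assumption the ratio comes out around 1.65x rather than the claimed 2x; the exact figure swings considerably depending on the assumed DRAM power and on which workload Nvidia's efficiency claim is actually measured against.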

The Grace Superchip's two CPU processors are connected within the package by a high-speed, low-latency, chip-to-chip interconnect called NVLink-C2C, which is also used to connect the GPU and CPU in the Grace Hopper Superchip. Nv also plans to offer custom silicon integration services built around NVLink-C2C, using the interconnect to design custom chips that link silicon from other companies to Nvidia's portfolio of processors, including GPUs, DPUs and CPUs.

Nvidia said it is working with "leading HPC, supercomputing, hyperscale and cloud customers" on the Grace Superchip, and Kharya clarified that the company is open to "all the levels of integration and engagement" with both server makers and cloud providers. He added that the Grace chip is designed to work in a variety of configurations, from CPU-only servers to servers with multiple discrete Hopper GPUs.

During his keynote for GTC, Nvidia CEO Jensen Huang expanded on the possibilities of Grace and Hopper mashups, and said his company can also create a Superchip with one Grace CPU processor and two Hopper GPUs. In the future, he said, his biz will expand the use of its NVLink-C2C interconnect to work with other kinds of chips that integrate its CPUs, GPUs, DPUs, network interface cards, and system-on-chips.

Kharya said the new Grace Superchip will create new opportunities for Nvidia in HPC applications and other areas that don't yet benefit from the acceleration capabilities of GPUs.

"There's a class of HPC applications, a long tail of HPC applications that have not yet been accelerated. Those applications would immediately benefit from the high performance here, but also applications in data analytics and in hyperscale computing, where you really need high performance cores and high memory bandwidth and energy efficiency. Those all would benefit from this design," he said. ®
