Nvidia wants to lure you to the Arm side with fresh server bait

GPU giant promises big advancements with Arm-based Grace CPU, says the software is ready


Interview 2023 is shaping up to become a big year for Arm-based server chips, and a significant part of this drive will come from Nvidia, which appears steadfast in its belief in the future of Arm, even if it can't own the company.

Several system vendors are expected to push out servers next year that will use Nvidia's new Arm-based chips. These consist of the Grace Superchip, which combines two of Nvidia's Grace CPUs, and the Grace Hopper Superchip, which brings together one Grace CPU with one Hopper GPU.

The vendors lining up servers include Dell Technologies, HPE, and Supermicro in the US; Lenovo in Hong Kong; Inspur in China; and ASUS, Foxconn, Gigabyte, and Wiwynn in Taiwan. The servers will target application areas where high performance is key: AI training and inference, high-performance computing, digital twins, and cloud gaming and graphics.

While Nvidia has vowed to continue using x86 CPUs from Intel and AMD in the future, the chip designer is hoping to lure datacenter operators and developers to the Arm side with the promise of some major advancements over the x86 chips currently on the market.

These advancements include 144 cores, up to 1TB of error-correcting LPDDR5x memory, and as much as 1TB/s of memory bandwidth in a single socket for the Grace Superchip. To let the Superchip's two CPUs communicate, Nvidia is using its 900GB/s NVLink-C2C interconnect technology, which is also used to connect the CPU and GPU inside the Grace Hopper Superchip.

"What Grace allows us is to push the boundaries of innovations and address the gaps that are there in the market," Paresh Kharya, Nvidia's director of datacenter computing, told The Register.

He claimed the 900GB/s interconnect speed is seven times faster than the PCIe Gen 5 technology that will debut with Intel's upcoming Sapphire Rapids server chips and AMD's Genoa server chips. "There's nothing else out there that matches close to the speed," he said.
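For rough context on that seven-times figure: a PCIe 5.0 x16 link moves roughly 64GB/s in each direction, or about 128GB/s bidirectionally. Those PCIe numbers are our assumption drawn from the PCIe 5.0 spec's 32GT/s-per-lane rate, not from Nvidia's claim, but they make for a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the "seven times faster" claim.
# Assumed figures (from the PCIe 5.0 spec, not from Nvidia): a x16 link
# runs at 32GT/s per lane, ~64GB/s per direction, ~128GB/s bidirectional.
nvlink_c2c_gbps = 900          # Nvidia's quoted NVLink-C2C bandwidth, GB/s
pcie5_x16_bidir_gbps = 128     # approximate PCIe 5.0 x16 bidirectional, GB/s

ratio = nvlink_c2c_gbps / pcie5_x16_bidir_gbps
print(f"NVLink-C2C vs PCIe 5.0 x16: {ratio:.1f}x")  # roughly 7x
```

On those assumptions the claim checks out, though note the comparison pits a coherent chip-to-chip link against an expansion bus, so it flatters NVLink somewhat.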

Kharya made some other major claims about Nvidia's Arm-based Superchips, including 2x higher energy efficiency for the memory subsystem, thanks to the use of LPDDR5x, and 2x faster memory bandwidth compared to systems currently on the market.

Nvidia has also teased how a system with the Grace Superchip will perform on CPU-bound tasks: an estimated score of 740 on the SPECrate 2017_int_base benchmark, according, of course, to Nvidia's own measurements. If we go with those numbers, that would make the system 50 percent faster than the CPU capabilities of Nvidia's DGX A100 system, which uses two 64-core AMD Epyc 7742 processors that came out in 2019.
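To put that 740 figure in perspective, you can work backwards from the 50 percent claim to the implied score for the DGX A100's CPU pair. All the inputs here are Nvidia's own estimates, so treat the result as marketing arithmetic rather than a measurement:

```python
# Sanity-check Nvidia's claim: the Grace Superchip is estimated at 740 on
# SPECrate 2017_int_base, said to be 50 percent faster than the two
# Epyc 7742 CPUs in a DGX A100. All figures are Nvidia's estimates.
grace_superchip_score = 740
claimed_speedup = 1.5

implied_dgx_a100_score = grace_superchip_score / claimed_speedup
print(f"Implied DGX A100 CPU score: {implied_dgx_a100_score:.0f}")  # ~493
```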

Kharya said Nvidia compared the Grace Superchip to an x86 processor from three years ago because it considers the DGX A100 the "top of the line server" available today for AI applications.

"So we really love all the innovation that comes to the market from x86 CPUs, and we and our customers are able to take advantage of all of that, but at the same time having now Grace in our portfolio, we are able to push the boundaries of innovation and fill in the gaps," he said.

But to take advantage of these capabilities, datacenter operators and developers will need to make a big leap from the comfortable world of x86 systems to the interesting world of Arm servers.

It may seem like a big leap, but Kharya said Nvidia has done a lot of groundwork in partnership with Arm to prepare the server software ecosystem. This started back in 2019, when the GPU giant announced that it would expand support for the CUDA programming model along with its "full stack of AI and HPC software" to Arm-based server CPUs. Since then, Nvidia has made more of its software compatible.

"We announced our CUDA on Arm project a while ago, 2019, and we've been on a constant journey towards that. All of our key stacks support Arm, and these include our AI platform, Nvidia AI, our Omniverse platform for digital twins as well as our Nvidia HPC platform. So we're working with the entire ecosystem to ensure readiness," Kharya said.

The company is also working to ensure Arm-based servers provide the best possible performance through its Nvidia-Certified Systems program, which already covers GPU servers on the market that use Ampere Computing's Arm-based Altra chips.

Some organizations have already announced plans to use servers with Nvidia's Grace and Grace Hopper Superchips, including the US Department of Energy's Los Alamos National Laboratory, which will use both chips for its next-generation Venado supercomputer.

But the true test will play out over the next few years as Nvidia tries to convince the datacenter world of Arm's differentiation and as organizations start putting the company's server designs through their paces. ®

