Nvidia CEO Jensen Huang talks chips, GPUs, metaverse
And 'AI factories'
GTC Nvidia has continued its shift away from primarily emphasizing its consumer GPU business that brought it into the wider market, instead focusing on emerging enterprise opportunities at its GPU Technology Conference, which is being held this week.
During the main keynote, Nvidia CEO Jensen Huang laid out grand plans to marshal creations in the Omniverse – more design and testing in virtual reality than another attempt at Second Life – using a range of hardware and software, from Hopper GPUs and Grace CPUs to a re-engineered network stack and software tools.
Huang emphasized Nvidia's role as a diversified company with stakes in artificial intelligence, supercomputing, healthcare, automobiles, and software through its technologies.
“Over the past decade, Nvidia computing delivered 1,000,000x speed up in AI and started the modern AI revolution. Now AI will revolutionize all industries,” he said, kicking off the conference.
It’s been a non-trivial year for Nvidia: the plan to acquire Arm washed out, and miscreants broke into Nv's networks, then stole and leaked internal files to punish it for, among other things, limiting crypto-mining on its GPUs.
But a new enterprise GPU and an ambitious Arm-based processor remained the mainstay of Huang’s trademark long-form keynote, which, as always, was packed with announcements and demonstrations. Here’s a rundown of the announcements.
New graphics and CPU processors
The Hopper architecture, which is targeted at datacenters, will succeed the previous architecture called Ampere, which was used in both professional and consumer GPU markets. The H100 GPU is the first silicon based on Hopper. It is targeted at applications that include AI, supercomputing, and 3D universes like the metaverse. The H100 is an 80-billion-transistor chip and will be made on TSMC's 4nm process.
Huang said the Hopper H100 provides a nine-times boost in training performance over Nvidia's A100 and thirty times more large-language-model inference throughput.
The H100 is the first PCIe Gen-5 and High-Bandwidth Memory 3 (HBM3) GPU, with 40 terabits per second I/O bandwidth, Huang said.
"Twenty H100s can sustain the equivalent of the entire world's internet traffic," Huang claimed. The GPU has AI engines for transformers to cut down training time from weeks to days. It also has a new set of instructions called DPX for dynamic programming, which speed up complex algorithms like protein folding up to 40 times, we're told.
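For a sense of scale, Huang's internet-traffic comparison follows directly from the I/O figure he quoted. A minimal Python sketch, using only the numbers reported above (the function and variable names are illustrative, not anything from Nvidia):

```python
# Figures quoted in the keynote, as reported above.
H100_IO_TBPS = 40   # terabits per second of I/O bandwidth per H100
GPU_COUNT = 20      # number of H100s in Huang's comparison


def aggregate_io_tbps(gpus: int, per_gpu_tbps: float) -> float:
    """Total I/O bandwidth, in terabits per second, for a group of GPUs."""
    return gpus * per_gpu_tbps


total_tbps = aggregate_io_tbps(GPU_COUNT, H100_IO_TBPS)
print(f"{GPU_COUNT} x {H100_IO_TBPS} Tbps = {total_tbps} Tbps")
```

That works out to 800 terabits per second across twenty GPUs. Whether that matches the world's internet traffic is Huang's claim, not something the arithmetic can confirm.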
Huang also announced the Grace CPU Superchip, the company’s first datacenter application processor for high-performance computing. Grace is a 144-CPU-core component consisting of two Arm-based processors interconnected within the same unit via Nvidia’s new NVLink chip-to-chip interconnect technology, and it will support 1TB of LPDDR5x memory.
Grace will have an estimated SPEC 2017 benchmark rate of 740, which is “nothing close to anything that ships today," Huang argued.
“The amazing thing is the entire module, including 1TB of memory, is only 500 Watts. We expect the Grace Superchip to be the highest performance and twice the energy efficiency of the best CPU at the time,” he added.
The Grace Superchip will complement the previously announced Grace Hopper Superchip, which combines one Grace CPU and one Hopper GPU within a single unit connected via NVLink. This chip is designed for large-scale AI and HPC applications.
The new GPU and CPU chips are major pieces in the company’s efforts to create AI-focused computers and the graphical plumbing for a metaverse through hardware and software.
Huang floated the idea of "AI factories" built on the Hopper GPU and its other homegrown hardware, by which he appeared to mean companies can use the equipment to manufacture machine-learning models from their silos of data. These models can, it's hoped, help staff and execs make better business decisions and generate savings, especially as those companies scale up compute.
“AI applications like speech, conversation, customer service, and recommenders are driving fundamental changes in data-center design. AI datacenters process mountains of continuous data to train and refine AI models. Raw data comes in, is refined and intelligence goes out,” Huang said.
Nvidia's CEO also detailed the new NVLink interconnect used for Grace and Hopper, which will be used to connect future Nvidia silicon, including CPUs, GPUs, DPUs and SoCs. The company is also opening up NVLink to partners that want to make custom chips using the tech.
Nv also announced new supercomputers built on the H100 GPU. The DGX H100 has eight H100s, delivering 32 petaflops of AI performance at FP8 precision, 640GB of HBM3 memory, and 24 terabytes per second of memory bandwidth. The company also announced the DGX Pod and DGX SuperPOD.
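Dividing the DGX H100 totals by its eight GPUs gives the implied per-H100 figures. A quick sketch in Python, assuming the system totals are simple sums across identical GPUs (the dictionary keys and function name are illustrative):

```python
# DGX H100 system totals as reported above.
DGX_H100 = {
    "gpus": 8,
    "fp8_petaflops": 32,      # AI performance at FP8 precision
    "hbm3_gb": 640,           # total HBM3 memory
    "mem_bw_tbytes_s": 24,    # total memory bandwidth, terabytes per second
}


def per_gpu(system: dict) -> dict:
    """Split each system-wide total evenly across the GPUs in the box."""
    n = system["gpus"]
    return {key: value / n for key, value in system.items() if key != "gpus"}


for key, value in per_gpu(DGX_H100).items():
    print(f"{key}: {value}")
```

On those assumptions each H100 accounts for 4 petaflops at FP8, 80GB of HBM3, and 3 terabytes per second of memory bandwidth.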
GTC rarely disappoints for the supercomputing set. Nvidia announced a new supercomputer called EOS, based on its new hardware, which Huang called Nvidia’s “first Hopper AI factory.” The system delivers 275 petaflops of FP64 (double-precision as per most HPC applications) performance, and for AI 18.4 exaflops at FP8, or 9 EFLOPS at FP16. The supercomputer is due to be up in a few months and will serve mainly as a showcase of the H100 hardware for customers, which will include all major OEMs, Huang said.
Much of the keynote focused on the Omniverse, which is the company's platform for building parallel 3D universes. Most of Nvidia’s GPUs, software stacks, and AI models for graphics-driven interfaces come together in Omniverse, through which the company aims to deliver a 3D version of the internet.
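As a rough sanity check, the EOS figures can be compared with the DGX H100 numbers quoted earlier. The Python sketch below assumes EOS is built from DGX H100 systems rated at 32 petaflops of FP8 each, and that peak performance scales linearly with system count. Neither assumption is spelled out in the keynote figures above, and the function names are illustrative:

```python
# Figures reported above, in petaflops.
EOS_FP8_PETAFLOPS = 18_400   # 18.4 exaflops at FP8
EOS_FP64_PETAFLOPS = 275
DGX_FP8_PETAFLOPS = 32       # one DGX H100 (assumed building block)
GPUS_PER_DGX = 8


def implied_dgx_count(total_pf: float, per_system_pf: float) -> float:
    """Number of systems needed if peak performance scales linearly."""
    return total_pf / per_system_pf


def implied_fp64_teraflops_per_gpu(total_fp64_pf: float, gpu_count: float) -> float:
    """FP64 teraflops per GPU implied by a system-wide petaflops total."""
    return total_fp64_pf * 1000 / gpu_count


systems = implied_dgx_count(EOS_FP8_PETAFLOPS, DGX_FP8_PETAFLOPS)
gpus = systems * GPUS_PER_DGX
fp64_per_gpu = implied_fp64_teraflops_per_gpu(EOS_FP64_PETAFLOPS, gpus)
print(f"Implied DGX H100 systems: {systems:.0f}")
print(f"Implied H100 GPUs: {gpus:.0f}")
print(f"Implied FP64 per GPU: {fp64_per_gpu:.1f} teraflops")
```

Under those assumptions the headline numbers imply roughly 575 DGX H100 systems, or about 4,600 GPUs, each contributing a little under 60 teraflops of FP64.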
Nvidia’s full-speed approach to a metaverse-like future is in contrast to rivals Intel, Qualcomm, and AMD, which are approaching the concept with caution given it is already being dismissed by critics as vaporware.
The keynote highlighted multiple Omniverse efforts underway, including the testing of robots virtually in a 3D space, and simulating global climate change using Earth-2, a supercomputer being built by Nvidia.
"Scientists predict that a supercomputer a billion times larger than today's is needed to effectively simulate regional climate change. Yet it is vital to predict now the impact of our industrial decisions and the effectiveness of mitigation and adaptation strategies,” Huang said.
Earth-2 is the “world's first AI digital twins supercomputer,” and Nvidia will “invent new AI and computing technologies to give us 1,000,000,000x boost before it's too late,” Huang claimed.
Conclusions drawn from those models will be based on probabilities determined by artificial intelligence models running on Nvidia's GPUs.
For Omniverse, Huang announced the Nvidia OVX systems, which will run large-scale simulations with multiple systems directly interacting with each other in real time.
The OVX hardware is anchored by a 400Gbps network platform called Spectrum-4, which includes a switch family, the ConnectX-7 SmartNIC, the BlueField-3 data-processing unit, and DOCA data-center infrastructure software. The Spectrum-4 platform has 100 billion transistors and will be made on TSMC’s 4nm process.
The company also announced the Omniverse Cloud for those who can't afford the hardware but want to create for the metaverse.
Robots and cars
Nvidia announced a computer for cars called Hyperion 9, which has the Drive Atlan system-on-chip. This will be twice as fast as today's Hyperion 8 computers based on the Orin SoC, we're told. The Hyperion 9 computers using the latest silicon will ship in 2026.
Hyperion 9 can run 14 cameras, 9 radars, 3 lidars, and 20 ultrasonic sensors, and can process twice the amount of sensor data compared to Hyperion 8, the CEO said.
In the meantime, Hyperion 8 computers will be used in Mercedes-Benz vehicles starting in 2024, and in vehicles by Jaguar Land Rover the following year. Nvidia has previously estimated roughly 10 million cars with Hyperion 8 computers will hit the road.
Nvidia expects to pocket revenues from software updates to the autonomous vehicles, and hardware upgrades throughout the life of a car. Other customers for Drive computers include EV maker BYD and Lucid Motors.
Nvidia’s automotive pipeline has increased to more than $11 billion over the next six years, the company said.
The biz is also building an Earth-scale digital twin in which autonomous driving systems can be explored with experimental algorithms and designs, and software can be tested before deployment to a fleet. The system uses a multi-modal map engine that creates an accurate 3D representation of the world. The map is loaded into Omniverse, which then allows simulation of autonomous driving to identify objects, road intersections, and pedestrians.
The goal is to make autonomous driving AI models more accurate via virtual simulation. “Each dynamic object can be animated or assigned an AI behavior model,” Huang noted.
He also talked about how Nvidia was accelerating the use of AI in medical applications via Clara Holoscan, a platform that includes a software development kit. The Holoscan development platform, which is already available to select customers, will become generally available in May. The "medical-grade readiness" of Holoscan will come in the first quarter of 2023.
On the robot side, Huang announced the Isaac Nova Orin hardware-and-software platform that provides the computing and sensory needs to develop autonomous mobile robots. The platform is based on the Jetson AGX Orin development board. Isaac is focused on moving robots, while another robotics offering called Metropolis is targeted at the development of stationary machines that track moving objects.
The Nova autonomous mobile robot platform will be available in the second quarter, Huang said. It has two cameras, two lidars, eight ultrasonic sensors and four fish-eye cameras. Nvidia can already simulate robot training in virtual environments via its Isaac SIM software stack.
Nvidia is topping off its crown jewels of chips with a healthy serving of software, from which the company hopes to generate more revenue in the future. Huang highlighted some of the US corp's software efforts during the keynote, which include 60 software development kit and framework updates.
Nvidia's AI platform, which is being used by 25,000 companies worldwide, is getting updates that include the Triton Inference Server, which Huang called the “Grand Central Station of AI deployment,” adding that it deploys models on every generation of Nvidia GPUs, x86 and Arm CPUs.
Nvidia’s AI backends include the Riva, Maxine, NeMo, and Merlin libraries, which are specialized frameworks and pretrained models.
The company announced general availability of Riva 2.0, which has speech recognition in seven languages and neural text-to-speech models with male and female voices. It can be tuned with the company’s Tao toolkit, which allows the transfer of learned features from existing neural networks to new ones.
The company also announced the release of the 1.0 version of Merlin, a framework for building large-scale deep learning recommender systems, along with the AI Accelerated program for engineers to collaborate on building AI solutions.
Nvidia is also updating the NeMo Megatron framework for training large language models, and the Maxine framework to enhance audio and video quality in markets that include telecommunications.
Huang also touted cuQuantum on DGX for simulation of quantum computing via GPUs. He also announced a new AI framework for the development of 6G networks. ®