Nvidia reveals specs of latest GPU: The Hopper-based H100

Performance boost promised, power stakes raised by 300 Watts


GTC Nvidia has unveiled its H100 GPU powered by its next-generation Hopper architecture, claiming it will provide a huge AI performance leap over the two-year-old A100, speeding up massive deep learning models in a more secure environment.

The new processor is also more power-hungry than ever before, demanding up to 700 Watts for the H100's SXM form factor, which requires Nvidia's custom HGX motherboard. That raises the power stakes 300 Watts above the thermal design power of the chip designer's A100 counterpart.

After months of anticipation, the GPU giant revealed not only the Hopper-fueled H100, but also H100-powered systems and reference architectures, and plenty of other details about the datacenter-grade GPU, during CEO Jensen Huang's keynote at his corporation's virtual GTC 2022 event on Tuesday.

Nvidia's rendering of its H100 SXM graphics processor

The 700-watt figure sounds like a lot, but Paresh Kharya, Nvidia's director of data center computing, told The Register that the H100 is still more power-efficient, offering over 3x the performance-per-watt of the A100.

Kharya based this on Nvidia's claim that the H100 SXM part, which will be complemented by PCIe form factors when it launches in the third quarter, is capable of four petaflops, or four quadrillion floating-point operations per second, of FP8, the company's new floating-point format for 8-bit math that now serves as its yardstick for AI performance.

This makes the H100 6x faster than the A100 at AI math, and Kharya said the GPU offers performance multiples across higher levels of floating-point precision: 3x faster for FP16 (2 petaflops), 3x faster for Nvidia's FP32-adjacent TensorFloat32 format (1 petaflop), and 3x faster for FP64 (60 teraflops). These stats, he said, make the H100 a heavy hitter against competitors in the AI space, including AMD, Cerebras Systems, and Graphcore.
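As a quick sanity check on the performance-per-watt claim: Nvidia's published A100 spec sheet lists 624 teraflops of sparse FP16 at a 400-Watt TDP (those A100 figures are our assumption here, not numbers from the keynote), and the arithmetic squares with Kharya's "over 3x":

```python
# Rough performance-per-watt comparison, based on Nvidia's claimed figures.
# The A100 numbers (624 TFLOPS sparse FP16, 400 W TDP) are assumed from
# Nvidia's published A100 specs, not from the GTC keynote itself.

h100_fp8_tflops = 4000   # 4 petaflops of FP8, per Nvidia's claim
h100_watts = 700         # SXM form factor TDP

a100_fp16_tflops = 624   # sparse FP16, Nvidia's published A100 spec
a100_watts = 400         # A100 SXM TDP

h100_perf_per_watt = h100_fp8_tflops / h100_watts   # ~5.7 TFLOPS/W
a100_perf_per_watt = a100_fp16_tflops / a100_watts  # ~1.6 TFLOPS/W

print(f"H100: {h100_perf_per_watt:.1f} TFLOPS/W")
print(f"A100: {a100_perf_per_watt:.1f} TFLOPS/W")
print(f"Ratio: {h100_perf_per_watt / a100_perf_per_watt:.1f}x")  # ~3.7x
```

The caveat, of course, is that this compares 8-bit math on the new silicon against 16-bit math on the old.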

"Each H100 will have 4 petaflops of AI computing. Nothing even comes even close to that level of AI performance," he said. "And combined with our software stack and a scalable platform that goes to the full data-center-scale, we are very well-positioned to continue to deliver performance benefits to our customers."

Speeds, feeds and AI scaling dreams

The H100 packs 80 billion transistors and is built on a custom 4nm process from TSMC, which Nvidia said makes the GPU the "world's most advanced chip." The GPU's architecture, Hopper, is the successor to 2020's Ampere and is named after US computer science pioneer Grace Hopper, whose first name is being used for Nvidia's first server CPU due in 2023.

Nvidia said the H100 is the first GPU to support PCIe Gen5 connectivity, which doubles the throughput of the previous generation to 128GBps. It's also the first to use the HBM3 high-bandwidth memory specification, sporting 80GB in total memory and delivering up to 3TBps of memory bandwidth, a 50 percent increase over the A100. The GPU can also move data on and off the chip quickly, thanks to its nearly 5TBps of external connectivity.
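That 128GBps figure is best read as a x16 link counted in both directions, before line-encoding overhead; a minimal sketch of the arithmetic (the bidirectional reading is our interpretation, not spelled out by Nvidia):

```python
# Where the 128GBps PCIe Gen5 figure likely comes from: a x16 link
# counted bidirectionally, at the raw signaling rate.
lanes = 16
gt_per_sec = 32                              # PCIe Gen5: 32 GT/s per lane

raw_per_direction = lanes * gt_per_sec / 8   # 64 GB/s each way
raw_bidirectional = 2 * raw_per_direction    # 128 GB/s combined

usable = raw_bidirectional * 128 / 130       # ~126 GB/s after 128b/130b
print(raw_bidirectional, round(usable, 1))   # 128.0 126.0
```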

These advancements are among the six "breakthrough innovations" Nvidia is claiming for the H100, which also include a new Transformer Engine for speeding up the popular deep learning model type behind many natural language processing workloads.

Kharya said the Transformer Engine, working in conjunction with Nvidia software, "intelligently" manages the precision of transformer models between 8-bit and 16-bit formats while maintaining accuracy, speeding up the training of such models by as much as 6x compared to the A100. This can reduce the time it takes to train transformer models, which can have as many as 530 billion parameters, from weeks to days, he said.
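Nvidia hasn't published how those precision decisions are made. Purely as an illustration of the idea, a per-layer policy might look like the sketch below, in which every name and threshold is hypothetical rather than Nvidia's actual mechanism (448 is the largest finite value in the E4M3 8-bit floating-point format):

```python
# Hypothetical illustration of per-layer dynamic precision selection.
# The names and threshold policy are invented for clarity; Nvidia has
# not disclosed the Transformer Engine's actual heuristics.

FP8_E4M3_MAX = 448.0  # largest finite value in the E4M3 8-bit format

def choose_precision(abs_max: float, scale: float) -> str:
    """Use FP8 when the layer's scaled values fit the format's range,
    otherwise fall back to FP16 to preserve accuracy."""
    if abs_max / scale > FP8_E4M3_MAX:
        return "fp16"  # values would overflow the 8-bit range
    return "fp8"       # safe to run this layer's math in 8-bit

# Example: activations peaking at 120.0 with a unit scale fit in FP8
print(choose_precision(abs_max=120.0, scale=1.0))  # -> "fp8"
```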

Another breakthrough comes from Nvidia's implementation of confidential computing for the H100, which the company said is a first for a GPU. This allows the GPU, working with an Intel or AMD CPU, to create a so-called Trusted Execution Environment in a virtualized setting that is protected from the hypervisor, the operating system, and anyone with physical access.

The H100's other breakthroughs include the fourth-generation Nvidia NVLink interconnect, which – when combined with an external NVLink Switch – allows up to 256 H100s to connect over a network at a bandwidth that is 9x higher than the previous generation. The GPU also comes with Nvidia's second generation of Multi-Instance GPU technology, which can now fully isolate and virtualize each of the feature's seven GPU instances.

The final highlight is the H100's new DPX instructions, which speed up dynamic programming – a method popular for a broad range of algorithms – by as much as 40x compared to CPUs and up to 7x compared to previous-generation GPUs.
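Dynamic programming covers textbook algorithms such as Floyd-Warshall route-finding and Smith-Waterman sequence alignment. For a concrete picture of the pattern involved, here is Floyd-Warshall in plain Python; this is purely illustrative of the recurrence such instructions accelerate on the GPU, not how DPX is actually programmed:

```python
# Floyd-Warshall all-pairs shortest paths: a classic dynamic programming
# algorithm of the kind Nvidia says DPX instructions speed up. This plain
# Python version only illustrates the recurrence.

INF = float("inf")

def floyd_warshall(dist):
    """dist[i][j] is the edge weight from i to j (INF if absent).
    Updates dist in place to the shortest distance between every pair."""
    n = len(dist)
    for k in range(n):            # allow paths routed through vertex k
        for i in range(n):
            for j in range(n):
                # Core DP step: keep the cheaper of the current path
                # and the detour through k.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
print(floyd_warshall(graph))
```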

Coming to a server – or cloud – near you

Nvidia is hoping to see widespread adoption of the H100 across on-premises datacenters, cloud instances and edge servers, and it expects to spur new buying cycles through a long list of server makers and cloud providers, including Amazon Web Services, Cisco, Dell Technologies, Google Cloud, Hewlett Packard Enterprise, Lenovo and Microsoft Azure.

As is tradition now, Nvidia is bringing the H100 to its line of DGX systems, which are pre-loaded with Nvidia software and optimized to provide the fastest AI performance. Appropriately called the DGX H100, the new system will feature eight GPUs, making it capable of delivering 32 petaflops of AI performance with Nvidia's FP8 format, which is 6x faster than the previous generation, according to the company.

A rendering of Nvidia's new DGX H100 system

Thanks to Nvidia's new generation of NVLink Switch technology, the company can connect up to 32 DGX H100 systems in a DGX SuperPOD cluster, making it capable of one exaflop of FP8, or one quintillion floating-point calculations per second, according to the company.

The chip designer can even connect multiple 32-system clusters together, and when we say "multiple," we mean quite a lot.

A case in point? Nvidia's newly announced Eos supercomputer, which connects 18 DGX SuperPOD clusters that consist of a total of 576 DGX H100s. The company said this new supercomputer can provide 18 exaflops of FP8, 9 exaflops of FP16 and 275 petaflops of FP64.
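Those headline numbers follow directly from the per-system figure; a quick check, taking Nvidia's 32 petaflops per DGX H100 at face value:

```python
# Sanity-checking Nvidia's scaling claims from its own per-system figure.
dgx_h100_fp8_pflops = 32              # 8 GPUs x 4 petaflops FP8 each

superpod_systems = 32
superpod_fp8_pflops = superpod_systems * dgx_h100_fp8_pflops
print(superpod_fp8_pflops)            # 1024 PF, i.e. ~1 exaflop

eos_systems = 576                     # 18 SuperPODs x 32 systems each
eos_fp8_pflops = eos_systems * dgx_h100_fp8_pflops
print(eos_fp8_pflops / 1000)          # ~18.4 exaflops, rounded to 18
```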

When the H100 launches in the third quarter, it will be available in three form factors. The first, SXM, will enable the fastest performance for AI training and inference, but it will only be available in servers that use Nvidia's HGX H100 server boards. The second form factor is a PCIe card for mainstream servers, which uses NVLink to connect two GPUs and provide 7x more bandwidth than PCIe Gen5 connectivity, Nvidia said.

The third option is the H100 CNX, a PCIe card that brings the H100 together with a ConnectX-7 SmartNIC from Nvidia's Mellanox acquisition for workloads that need high throughput, such as multi-node AI training for businesses or 5G signal processing in edge environments. ®

