Nvidia shrinks GPUs to help squeeze AI into your data center, make its VMware friendship work

Creates two new mini models on the assumption you won't build silos to host huge, hot monsters

GTC Nvidia has created a pair of small data-center-friendly GPUs because it doesn’t think customers will get into AI acceleration unless they can use the servers they already operate.

The new models – the A10 and A30 – occupy one and two full-height, full-length PCIe slots respectively. Both employ the same Ampere architecture as Nvidia's other GPUs, but are physically smaller – and that matters in the context of the recently launched AI Enterprise bundle that Nvidia packages exclusively for VMware's vSphere.

Before VMware got excited about private and hybrid clouds, it was all-in on server consolidation: turning your server fleet into a logical pool of resources instead of tightly coupling servers to specific applications.

Manuvir Das, Nvidia's head of enterprise computing, told The Register that Nvidia has figured out that in practice big, hot GPUs end up in dedicated hardware, more-or-less tightly coupling AI workloads to specific servers. Das and Nvidia would rather you run AI Enterprise on the hardware you already have and/or the hardware you buy most of – which is 1U and 2U servers.

Hence the need for smaller GPUs, both to fit into common servers and in recognition that on-prem data centres are under pressure to pack kit densely. Most big server-makers (Dell, Lenovo, H3C, Inspur, QCT and Supermicro) are on board with the A10 and A30, and Nvidia swears its EGX containerised ML platform will be right at home in this environment.
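
Nvidia hasn't detailed how EGX will be configured on these boxes, but the general shape of a containerised GPU deployment is that the workload simply sees whatever accelerators the host exposes to it. As a loose illustration only – assuming PyTorch inside the container image, which is our assumption rather than anything Nvidia specified – a job might sanity-check its GPU allocation like this:

    # Illustrative sketch: a containerised job checking which GPUs it can see.
    # Assumes PyTorch is installed in the container image; this is not EGX code.
    import torch

    if not torch.cuda.is_available():
        raise SystemExit("No GPU visible inside this container")

    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")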

The Nvidia A10 GPU

Das said the new GPUs will add $2,000 to $3,000 to the cost of a $12,000 server. Nvidia offered the following specs for the A10:

  • TDP: 150W
  • FP32: 31.2 teraFLOPS
  • BFLOAT16 Tensor Core: 125 teraFLOPS | 250 teraFLOPS*
  • FP16 Tensor Core: 125 teraFLOPS | 250 teraFLOPS*
  • INT8 Tensor Core: 250 TOPS | 500 TOPS*
  • GPU bandwidth: 600GB/s
  • GPU memory: 24GB GDDR6

Here are specs for the A30:

  • TDP: 165W
  • FP32: 10.3 teraFLOPS
  • BFLOAT16 Tensor Core: 165 teraFLOPS | 330 teraFLOPS*
  • FP16 Tensor Core: 165 teraFLOPS | 330 teraFLOPS*
  • INT8 Tensor Core: 330 TOPS | 661 TOPS*
  • GPU bandwidth: 933GB/s
  • GPU memory: 24GB HBM2 (on-package)

* With sparsity
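
Reduced to throughput per watt, those figures give a rough sense of how the two parts compare. The Python sketch below is a back-of-envelope calculation using only the FP16 Tensor Core and TDP numbers quoted above; it is illustrative arithmetic, not an Nvidia benchmark.

    # Back-of-envelope efficiency comparison from Nvidia's quoted specs.
    # FP16 Tensor Core throughput in teraFLOPS (dense, sparse) and TDP in watts.
    specs = {
        "A10": {"tdp_w": 150, "fp16_dense": 125, "fp16_sparse": 250},
        "A30": {"tdp_w": 165, "fp16_dense": 165, "fp16_sparse": 330},
    }

    for name, s in specs.items():
        print(f"{name}: {s['fp16_dense'] / s['tdp_w']:.2f} TFLOPS/W dense, "
              f"{s['fp16_sparse'] / s['tdp_w']:.2f} TFLOPS/W with sparsity")

On those numbers the A30 comes out ahead on Tensor Core throughput per watt (about 1.0 dense FP16 teraFLOPS per watt versus the A10's 0.83), while the A10 leads comfortably on plain FP32.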

Nvidia said the A10 will be available from this month, and the A30 will follow later this year; bear in mind Nvidia's wares are often hard to find amid the ongoing global silicon drought. ®

In other GTC news... Today marks the start of Nvidia's 2021 GPU Technology Conference, and as such the company has a flurry of things to announce besides the A10 and A30. Here are a couple of highlights:

  • Nvidia has designed an Arm-based server-grade processor called Grace, which will use a set of future Arm Neoverse CPU cores. It's aimed at supercomputers and massive AI workloads.
  • It also talked up its BlueField-3 DPU, which is designed for so-called smart NICs that offload software-defined networking, storage, and security functions from a machine's host processors into dedicated hardware.
