Nvidia talks up local AI with RTX 500, 1000 Ada mobile GPUs

As always, the lack of memory could prove limiting

Nvidia rolled out a pair of entry-level laptop GPUs aimed at professional notebooks on Monday, with performance claims far exceeding those of standalone systems on chip (SoCs) from Intel and AMD.

Based on the Ada architecture, which debuted in late 2022, the RTX 500 and 1000 Ada Generation cards boast 154 and 193 teraFLOPS of sparse FP8 performance, which is squeezed from 2,048 and 2,560 CUDA cores and 64 and 80 tensor cores respectively.

As for the entry level card, Nvidia claims this translates into "14x the generative AI performance for models like Stable Diffusion, up to three times faster photo editing with AI and up to 10x the graphics performance for 3D rendering compared with a CPU-only configuration."

But, as usual, we recommend taking such claims with a grain of salt.

Nvidia's entry-level RTX 500 and 1000 are its latest mobile processors based on its Ada Lovelace architecture. Source: Nvidia.

While Nvidia did touch on professional applications like video rendering, it's clearly focused on the emerging AI PC segment, where even its lowest-end GPUs outperform Intel and AMD's Core Ultra and Ryzen 8040-series processors, which top out at 34 and 39 TOPS respectively. And that's when you add up the performance of their CPU, NPU, and GPU.

However, more important to running AI models locally is an adequate supply of fast memory. In this case, customers could find the new cards somewhat confining. The base model RTX 500 Ada comes equipped with 4GB of GDDR6 memory with a peak bandwidth of 128GB/s. Meanwhile, the higher-end RTX 1000 Ada jumps up to 6GB of memory capable of hitting 192GB/s.

Nvidia envisions businesses using these cards to "query their internal knowledge base with chatbot-like interfaces using local large language models."

However, with 4GB-6GB of memory to work with, that doesn't leave much space for many of the more popular LLMs without resorting to clever tricks, like quantization to shrink the memory footprint, or sacrificing floating-point precision and dropping down to something like Int4.

To drive home this point, Nvidia's recently unveiled Chat with RTX — an AI chatbot that owners of Ampere and Ada Lovelace cards can download and run locally on their machines — won't run on either of the cards announced today because they lack the necessary memory.

Generally speaking, you need about 1GB of video memory for every billion parameters to run a model at 8-bit floating-point or integer precision. So Meta's Llama 2 7B, a popular benchmark for AI PC performance given its relatively small size, would need about 7GB.
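To make that rule of thumb concrete, here is a rough back-of-the-envelope sketch in Python. It counts model weights only and ignores the KV cache, activations, and framework overhead, so real-world requirements will be higher. The function names and the card table are ours for illustration, not anything from Nvidia.

```python
# Rough estimate of the memory needed to hold a model's weights.
# Assumption: weights only; KV cache, activations and runtime overhead
# are ignored, so actual usage will be higher than these figures.

def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight footprint in GB (1GB taken as 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_param: int, vram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of video memory."""
    return weight_memory_gb(params_billions, bits_per_param) <= vram_gb


# Memory capacities quoted in the article (hypothetical lookup table).
CARDS = {"RTX 500 Ada": 4, "RTX 1000 Ada": 6}

for bits in (8, 4):
    need = weight_memory_gb(7, bits)  # Llama 2 7B
    for card, vram in CARDS.items():
        verdict = "fits" if fits(7, bits, vram) else "does not fit"
        print(f"Llama 2 7B at {bits}-bit: ~{need:.1f}GB, {verdict} in {card} ({vram}GB)")
```

By this estimate, 8-bit weights overflow both cards, while 4-bit weights come in at roughly 3.5GB, which squeezes onto either card on paper but leaves little headroom on the 4GB part once the overhead ignored here is counted.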

Getting around this limitation would either mean stepping up to the pricier RTX 2000 Ada, or taking advantage of quantized models or lower precision to reduce the model's memory footprint.

But, beyond being able to tinker with chatbot and image-gen models like Llama 2 7B or Stable Diffusion, software capable of harnessing local models is still relatively few and far between, and the software that does exist is predominantly aimed at creatives. This is expected to change as Microsoft rolls more AI features into its Windows operating system, presumably in an effort to lessen the burden on cloud servers.

While Nvidia may have a performance advantage over Intel and AMD's SoCs, the cards do come with some drawbacks - particularly when it comes to power and thermal range. The RTX 500 Ada can be configured for a power envelope of 35W-60W, while the RTX 1000 Ada has the same base TDP but can consume upwards of 140W.

This wide operating range isn't uncommon for notebook accelerators, as the chip's performance is often determined more by the ability to cool the system in a given form factor. This also means that Nvidia's ability to hit its claimed performance marks could be hampered by OEMs prioritizing thin notebooks that run as quiet and cool as possible.

Nvidia tells The Register its 154 and 193 teraFLOPS claims were achieved at the cards' peak TDP.

If you are interested in picking up one of these GPUs, Nvidia expects them to start shipping in OEM notebooks from Dell, HP, Lenovo, MSI and others later this spring. ®
