AI PC hype seems to be making PCs better – in hardware terms, at least
16GB of RAM should be enough for anyone who wants to run models locally. GPUs, NPUs and more kit will be needed, too
Comment What is an AI PC? What is its killer app? Both are open questions, although we do know that running AI models locally – not over the network like when you're playing with ChatGPT – requires significant grunt on your desk or lap.
It seems certain that to run AI models well, either PCs must become more powerful or AI models must become more frugal.
With the first batch of so-called AI PCs from Intel and AMD trickling onto the market this year, compact models like the Llama 2 7B large language model or the Stable Diffusion image generator are often referenced. But, while small compared to models like GPT-4, they're still fairly demanding for a typical notebook PC.
AI rising tide lifts all PCs
For customers, the resource-intensive nature of localized AI – as we've seen with past leaps in software requirements – will ultimately prove a good thing.
The launch of Windows 7 in 2009 springs to mind as an example of how software forced the PC market to evolve – eventually. Windows Vista, released three years before, had huge system requirements compared to its predecessor Windows XP. The latter ran very well on 512MB of RAM, but that was the barest minimum for Vista. It's one reason why so few netbooks ever ran the much-maligned OS, with many OEMs opting to stick with XP.
When Windows 7 came along and offered a better reason to upgrade, both OEMs and customers did so quite happily despite hefty hardware requirements. Its capabilities were sufficiently compelling that it was worth investing in the hardware to take advantage of them.
AI is touted as having similar market-making powers.
For high or even mid-range notebooks with current-gen graphics and/or integrated neural processing units (NPUs), this really won't be a problem – at least not computationally. But when it comes to memory, 8GB just isn't going to cut it anymore. It might be fine for one AI app, but is nowhere near enough to run larger models, or multiple smaller ones.
Even with four-bit quantization, Llama 2 7B is going to require around 3.5GB of fast memory for its weights alone, plus a fairly beefy GPU and/or NPU to make it an enjoyable experience. So clearly, the minimum spec for a PC will need to rise.
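As a rough check on that 3.5GB figure, a model's weight footprint is simply its parameter count multiplied by the bytes each parameter occupies. The sketch below is back-of-the-envelope arithmetic, not a measurement: the function name `weight_footprint_gb` is made up for illustration, it uses decimal gigabytes, and it ignores the KV cache, activations, and runtime overhead, all of which push the real number higher.

```python
# Bits stored per weight at each precision.
BITS_PER_PARAM = {"fp16": 16, "int8": 8, "int4": 4}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate memory needed for the weights alone, in decimal GB.

    Hypothetical helper: excludes KV cache, activations, and runtime overhead.
    """
    bits = BITS_PER_PARAM[precision]
    total_bytes = params_billions * 1e9 * bits / 8
    return total_bytes / 1e9


# Compare a 7B-parameter model across the three precisions.
for precision in BITS_PER_PARAM:
    size_gb = weight_footprint_gb(7, precision)
    print(f"7B model at {precision}: {size_gb:.1f} GB")
```

The four-bit row is where the 3.5GB comes from. The same model at FP16 needs four times as much, which is why 8GB machines run out of room so quickly.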
According to TrendForce, that's exactly what's happening. In a recent report, the group claimed Microsoft will define AI PCs as having 16GB of RAM and 40 tera-operations per second (TOPS) of NPU inferencing performance.
For reference, the NPUs in Intel, AMD, and Apple's latest notebook chips are capable of pushing 16-18 TOPS. So there's still some progress to be made, if TrendForce's claims are accurate.
However, 40 TOPS is right around what you can expect from Intel and AMD's latest thin-and-light processor families once their CPU, GPU, and NPU resources are combined.
For bigger models, like Llama 2 and Stable Diffusion, it's our understanding that these are still going to run predominantly on the GPU, which is responsible for the majority of PCs' AI grunt these days anyway.
NPUs are designed to accelerate small machine learning tasks on devices – things like facial and object detection in the photos app, optical character recognition, or subject selection. These models are small enough that they can run on an NPU without stressing the CPU and GPU, or draining your battery too badly.
GPU vendors take notice
Speaking of GPUs, it's not just notebook PCs that will enjoy a performance boost from all this AI hype. In addition to its NPU-toting desktop APUs, AMD previewed a new entry-level GPU with 16GB of GDDR6 memory at CES earlier this month.
The RX 7600 XT is nearly identical to the non-XT variant we looked at last year, with the main difference being more memory and ever so slightly higher clock speeds. Why? Well, on top of supporting more demanding games, more video memory allows the card to support bigger AI models without resorting to quantization.
With 16GB of vRAM you can easily run a 7B parameter model at half precision (FP16), or a larger model at Int8. In fact, AMD touted this capability in its press deck ahead of the launch.
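To see why 16GB is the threshold that matters here, consider a simple fit check. This is a sketch under stated assumptions rather than a sizing tool: `fits_in_vram` is a hypothetical helper, and the 2GB of headroom reserved for the KV cache, activations, and the display is our guess, not a vendor figure. One billion parameters at N bytes each works out to N decimal gigabytes.

```python
# Bytes stored per weight at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def fits_in_vram(params_billions: float, precision: str,
                 vram_gb: float = 16.0, headroom_gb: float = 2.0) -> bool:
    """Return True if the weights plus an assumed headroom fit in video memory.

    headroom_gb is an assumption covering KV cache, activations, and display use.
    """
    weights_gb = params_billions * BYTES_PER_PARAM[precision]
    return weights_gb + headroom_gb <= vram_gb


# Check 7B and 13B models against a 16GB card.
for size in (7, 13):
    for precision in ("fp16", "int8"):
        verdict = "fits" if fits_in_vram(size, precision) else "does not fit"
        print(f"{size}B at {precision}: {verdict} in 16GB")
```

Under those assumptions a 7B model at FP16 only just squeezes in, while a 13B model needs Int8 to fit on the same card.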
It's worth noting that AMD only recently announced support for its ROCm AI framework on RDNA graphics cards, like its 7000-series parts. Alongside its MI300 APUs and GPUs, launched in December, the chip designer also introduced the Ryzen AI software suite to help developers build machine learning apps that can dynamically tap into CPU, NPU, and GPU resources.
Nvidia has also tweaked its 40-series lineup with a few AI-friendly improvements. At CES this month the AI chip king unveiled the RTX 4070 Ti Super with 16GB of vRAM – up from 12GB on the earlier non-Super variant. The chip also boasts a 256-bit memory bus. So in addition to being able to run larger models, the higher bandwidth should speed AI response times.
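The link between bus width and response time can be sketched with a common rule of thumb: when generating text for a single user, each new token requires reading every weight from memory once, so memory bandwidth divided by model size gives a ceiling on tokens per second. The numbers below are illustrative assumptions, not specifications from Nvidia's announcement: the 21Gbps per-pin data rate and the 192-bit comparison bus are our own inputs, and both function names are hypothetical.

```python
def peak_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Theoretical peak memory bandwidth in GB/s: one pin per bus bit, 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


def decode_ceiling_tokens_per_s(bandwidth_gbs: float, weights_gb: float) -> float:
    """Upper bound on single-stream token rate if every weight is read once per token."""
    return bandwidth_gbs / weights_gb


ASSUMED_GBPS_PER_PIN = 21.0   # assumption, not taken from the article
WEIGHTS_GB = 14.0             # 7B parameters at FP16 (2 bytes each)

# Compare an assumed narrower bus with the 256-bit bus mentioned above.
for bus_bits in (192, 256):
    bandwidth = peak_bandwidth_gbs(bus_bits, ASSUMED_GBPS_PER_PIN)
    ceiling = decode_ceiling_tokens_per_s(bandwidth, WEIGHTS_GB)
    print(f"{bus_bits}-bit bus: {bandwidth:.0f} GB/s, at most {ceiling:.0f} tokens/s")
```

Real-world throughput will land below these ceilings, but the ratio between the two rows shows why a wider bus matters as much as extra capacity.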
Nvidia has been particularly outspoken about its vision for in-game AI to make interactions with non-player characters less of a cookie-cutter experience. As game developers begin implementing the tech, we expect to see system requirements – and ultimately GPU specs – creep upwards. Whether Nvidia or AMD will jack up GPU prices to cover this remains to be seen.
It's clear we're still in the "build it and hope they come" phase of the AI PC's evolution. A critical mass of AI-capable hardware will need to be in customers' hands before software developers train and integrate LLMs and other machine-learning algorithms into their apps.
For the economies of scale to make sense, it's necessary to set a minimum acceptable level of performance at a price the largest number of users can afford. Ultimately, this means even the specs of entry-level systems will change to handle AI workloads.
As adoption of AI PCs and the availability of optimized software increase, we expect new use cases to emerge, models to grow larger, and system requirements to trend ever upward. ®