GPU cores get some love, as well
As mentioned above, Kaveri's GPU cores are based on AMD's GCN architecture, first unveiled in July 2011 and now extending throughout the full range of AMD's offerings. "Now every product in the AMD portfolio," Lansing said, "from the 2-watt tablet to the multiple-hundreds-of-watts discrete graphics to all the game consoles are now unified on the GCN architecture" – which, by the way, supports AMD's low-level Mantle API for juicing gaming performance, as well as DirectX 11.2.
Macri said that the mobile Kaveri's designers allocated nearly half (47 per cent) of its entire 245mm², 2.41 billion–transistor, 28nm die to graphics and other accelerators – the other 53 per cent is filled by CPU cores, caches, I/O, power management, and other housekeeping stuff that chipsters often call the "uncore" – for one simple reason. "It's not about a spreadsheet or text or simple things like that anymore," he said. "Now it's all about visualization."
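For a rough sense of scale, the split quoted above can be turned into square millimetres. This is a back-of-the-envelope sketch only: the 245mm² and 53 per cent figures come from the text, while the variable names are ours, and the result says nothing about how transistors are distributed across the die.

```python
# Figures quoted in the article.
DIE_AREA_MM2 = 245          # total die area
NON_GPU_PERCENT = 53        # CPU cores, caches, I/O, power management, etc.

# Whatever is not "the other 53 per cent" goes to graphics and accelerators.
gpu_percent = 100 - NON_GPU_PERCENT
gpu_area_mm2 = DIE_AREA_MM2 * gpu_percent / 100
non_gpu_area_mm2 = DIE_AREA_MM2 - gpu_area_mm2

print(f"Graphics and accelerators: {gpu_percent}% of die, about {gpu_area_mm2:.0f} mm2")
print(f"Everything else: {NON_GPU_PERCENT}% of die, about {non_gpu_area_mm2:.0f} mm2")
```

In other words, on these numbers roughly 115mm² of the die is given over to graphics and other accelerators.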
Data analysis has moved beyond numbers into shapes, graphs, 3D, and other visualization methods, he said, and a beefy graphics and multimedia subsystem is needed to keep up with both the parallelized crunching of that data and the presentation of the resulting analysis. "We all work with our eyes," Macri said. "One picture is worth more than a thousand words."
Well, there's that – but there's also gaming. According to Macri, stats from Steam show that 35 per cent of their gamers have rigs that are less powerful than the GPU in the mobile Kaveri.
Kaveri's GPU is essentially a version of AMD's "Hawaii" GCN cores – part of their "Volcanic Islands" series, successor to "Sea Islands" and precursor to "Pirate Islands", which should begin appearing next year. There are, however, a few changes.
The two big differences, Macri said, are coherency and context-switching – both key elements of HSA, for which Hawaii, being strictly a graphics architecture, had no need.
There are eight graphics compute units in the top-of-the-line Kaveri part, with 512 IEEE 754-2008–compliant floating point–capable shaders between them (64 per compute unit), as well as a flat address space, which Macri characterized as "absolutely key." Some precision improvements have been added, as well.
The new Kaveri line includes parts with as few as three or as many as eight GPU cores
A 64KB local data share minimizes the off-die needs of the GPU, improving power efficiency. "This is a big performance-per-watt improvement of the graphics back end," said Macri.
One elegant feature of the HSA capability of the mobile Kaveri's GPU is that the eight "compute units" that make it up are all asynchronous. They're all able to go off and do whatever the hell they want to do – or are told to do – whenever the hell they want without needing to consult with their brethren.
"They can each run their own set of tasks," Macri said. "They work off a set of dispatch queues – each one can manage up to eight queues – so they can basically be working with different pieces of different threads."
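As a purely illustrative toy, the arrangement Macri describes (eight units, each working off up to eight dispatch queues of its own) can be modelled in a few lines of Python. Everything here is an assumption made for the sketch: the class and function names are hypothetical, the round-robin draining order is ours, and none of it reflects how AMD's hardware or the HSA runtime actually schedules work.

```python
from collections import deque

NUM_UNITS = 8         # compute units, per the article
QUEUES_PER_UNIT = 8   # dispatch queues each unit can manage, per the article


class ComputeUnit:
    """Hypothetical model of one unit draining its own dispatch queues."""

    def __init__(self, unit_id):
        self.unit_id = unit_id
        self.queues = [deque() for _ in range(QUEUES_PER_UNIT)]
        self.completed = []

    def submit(self, queue_index, task):
        """Place a task on one of this unit's queues."""
        self.queues[queue_index].append(task)

    def step(self):
        """Take one task from each non-empty queue; return how many ran."""
        ran = 0
        for queue in self.queues:
            if queue:
                self.completed.append(queue.popleft())
                ran += 1
        return ran


def run_all(units):
    """Step every unit until all queues are empty.

    Each unit looks only at its own queues, so no unit consults another.
    """
    while sum(unit.step() for unit in units) > 0:
        pass


units = [ComputeUnit(i) for i in range(NUM_UNITS)]
units[0].submit(0, "physics-a1")
units[0].submit(0, "physics-a2")
units[0].submit(1, "audio-a1")
units[3].submit(5, "physics-b1")
run_all(units)

total_queues = NUM_UNITS * QUEUES_PER_UNIT
print(total_queues)          # 64 queues across the whole GPU in this model
print(units[0].completed)    # tasks from two different queues, interleaved
print(units[3].completed)
```

In this model unit 0 interleaves work from two of its queues while unit 3 gets on with something unrelated, which is the "different pieces of different threads" idea in miniature.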
The addition of fast context-switching – one of the GPU cores' upgrades from Hawaii – is only employed when the GPU is performing a compute task, Macri emphasized. "We haven't applied context-switching to 3D graphics yet," he said, noting that the state of a 3D process currently occupies most if not all of a graphics engine. "It's very big," Macri said. "We're working on how to make that work, but we're not here to talk about that today."