ARM's big.LITTLE now big.LITTLE.fat.SKINNY: CPU designer makes room for accelerators
AI. Check. Machine learning. Check. Brand name. Check
ARM is today touting a new way of organizing processor chips – one that will squeeze accelerators designed for AI and such tasks into phones, PCs, cars, and so on.
This layout, dubbed DynamIQ, builds on the Brit processor designer's big.LITTLE architecture that's been around since 2011. Big.LITTLE works by hooking a set of lightweight ARM cores to a set of beefier and more energy-sucking cores, so that when not much processing power is needed, the collection of fatter cores can be powered down to save battery life. When an app or game needs some extra oomph, it can spin up the meaty CPUs for a while. The key thing here is that you have a bunch of cores that are one size, and another bunch that are another. Hence the name, big and little.
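To make the power-down-the-big-cores idea concrete, here is a minimal sketch of the kind of decision a scheduler or governor might make. It is not ARM's or any kernel's actual policy: the function name, the normalized load figure, and the two thresholds are all assumptions made up for illustration.

```python
def big_cluster_should_run(load: float, big_on: bool,
                           up: float = 0.7, down: float = 0.3) -> bool:
    """Decide whether the power-hungry 'big' cores should be powered.

    load   -- demand normalized to 0.0..1.0 (hypothetical metric)
    big_on -- whether the big cores are currently powered
    up     -- load at which sleeping big cores are woken (assumed value)
    down   -- load below which running big cores are shut off (assumed value)

    Two thresholds give hysteresis, so the big cores do not flap on and
    off when the load hovers around a single cut-off point.
    """
    if big_on:
        # Already running: stay up until demand clearly drops away.
        return load > down
    # Currently off: only wake when the little cores are clearly swamped.
    return load >= up


if __name__ == "__main__":
    state = False
    for load in (0.1, 0.8, 0.5, 0.2):
        state = big_cluster_should_run(load, state)
        print(f"load={load:.1f} big cores {'on' if state else 'off'}")
```

The little cores are assumed to stay powered throughout; only the big set is switched in and out.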
DynamIQ expands on that approach by letting chip architects bung all sorts of cores into the same system-on-chip, along with an upgraded internal memory bus to cope with the data flowing between the processing units. These cores can be big, medium, curvy, and little. Alongside these, on the same chip die, you can place accelerators that perform machine-learning tasks in silicon – most likely inference work from trained models. This is, essentially, plugging hardware acceleration right into the internal highway between the general ARM compute cores, allowing them to share information and memory directly.
It's up to the individual system-on-chip designers to pick and choose the cores they want, and then fit them together using DynamIQ; these designers license the architecture from ARM as per usual, configure it as needed, and lay it on their silicon. DynamIQ can juggle up to eight heterogeneous cores at once in a single on-chip cluster, and each core can have different power requirements and performance output. TrustZone is still supported by DynamIQ.
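As a rough model of what "up to eight mixed cores per cluster" implies for software, the sketch below builds a cluster and picks the cheapest core that can handle a job. Only the eight-core limit comes from ARM's announcement; the `Core` fields, the performance and power numbers, and the selection rule are hypothetical.

```python
from dataclasses import dataclass

MAX_CORES_PER_CLUSTER = 8  # the one figure here that comes from ARM


@dataclass(frozen=True)
class Core:
    kind: str        # e.g. "big" or "little" (labels are illustrative)
    perf: float      # relative performance, made-up units
    power_mw: float  # power draw when active, made-up figure


def build_cluster(cores):
    """Return the cores as a tuple, rejecting empty or oversized clusters."""
    cores = tuple(cores)
    if not cores:
        raise ValueError("a cluster needs at least one core")
    if len(cores) > MAX_CORES_PER_CLUSTER:
        raise ValueError("a cluster holds at most eight cores")
    return cores


def pick_core(cluster, needed_perf):
    """Pick the lowest-power core that meets needed_perf.

    If no core is fast enough, fall back to the fastest one available.
    """
    fast_enough = [c for c in cluster if c.perf >= needed_perf]
    if fast_enough:
        return min(fast_enough, key=lambda c: c.power_mw)
    return max(cluster, key=lambda c: c.perf)


if __name__ == "__main__":
    # One big core plus seven little ones: a mix the old two-bunch
    # big.LITTLE description above would not obviously allow for.
    cluster = build_cluster(
        [Core("big", 4.0, 750.0)] + [Core("little", 1.0, 100.0)] * 7
    )
    print(pick_core(cluster, 0.5).kind)
    print(pick_core(cluster, 3.0).kind)
```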
Blueprints for new DynamIQ-compatible ARM Cortex-A family cores will emerge later this year, we're told, and we guess we'll see these in devices in 2018. ARM tells us some of its licensees already have their hands on the DynamIQ designs. Existing Cortex cores are not DynamIQ-ready, by the way: if you're a system-on-chip designer and you want to use DynamIQ, you'll have to license one of the new CPUs.
Along with DynamIQ, ARM will later this year reveal new CPU instructions and software libraries to speed up artificial intelligence software on its next-generation cores. It may sound as though ARM is getting into the AI accelerator business: what's really happening here is that it will tout DynamIQ-enabled cores that have instruction set extensions to perform machine-learning operations in hardware. Separately, system-on-chip designers can provide their own specialized accelerator cores that plug into DynamIQ.
We just hope people have learned from the Samsung Exynos 8890 debacle, which mixed different cache line lengths in its multicore ARM-compatible processor design, causing applications to crash. It's all well and good having a bunch of different CPUs in your system-on-chip; just don't make complex software even more complex as a result of a more complex architecture. Complex software is buggy software, and normal folk don't care about fancy memory subsystem enhancements and CPU networking – they care about apps not working. ®
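For the curious, here is a toy model of how mismatched cache line lengths can bite. It assumes, purely for illustration, that software reads a 128-byte line size on one core, caches that value, and later walks a buffer on a core whose lines are 64 bytes long. The function names and sizes are ours, not Samsung's or ARM's, and the buffer is assumed to start on a line boundary.

```python
def lines_touched(start, length, stride, line_size):
    """Cache-line indices hit when stepping through a buffer by `stride`."""
    return {addr // line_size for addr in range(start, start + length, stride)}


def lines_missed(start, length, cached_stride, actual_line_size):
    """Lines a maintenance loop skips when its stride is too large.

    cached_stride    -- line size the software remembered (assumed stale)
    actual_line_size -- line size of the core the loop really runs on
    """
    needed = lines_touched(start, length, actual_line_size, actual_line_size)
    touched = lines_touched(start, length, cached_stride, actual_line_size)
    return sorted(needed - touched)


if __name__ == "__main__":
    # Stepping by 128 bytes over 64-byte lines skips every other line,
    # leaving stale data or instructions behind.
    print(lines_missed(0, 512, cached_stride=128, actual_line_size=64))
    # Stepping by the true line size misses nothing.
    print(lines_missed(0, 512, cached_stride=64, actual_line_size=64))
```

In this model, a flush or invalidate loop built on the stale value leaves half the lines untouched, which is the sort of thing that shows up later as a seemingly random crash.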