Arm is aiming two new processing unit designs at slimline AI workloads in smart speakers and other Internet-of-Things devices.
The more powerful of the two, known as Cortex-M55, is a general microcontroller-grade CPU blueprint, while the other, named Ethos-U55, is essentially an AI accelerator. The Cortex-M55 is based on Arm's Helium technology and can perform vector calculations among other things. The Ethos-U55, on the other hand, is a novel architecture for Arm and has been described as a micro Neural Processing Unit, or microNPU for short.
Both processors are available now to license, and are intended to be used together, Thomas Ensergueix, senior director of the IoT & Embedded team at Arm, told The Register. The M55 runs application code, while the U55 does all the neural-network mathematics in fast hardware. "The microNPU cannot be used on its own; it needs to be paired with a CPU like the Cortex-M55," he said. "Together, this system delivers 480X the performance compared to previous Cortex-M generations working on their own."
Arm licenses only the IP for its cores: to build a chip based on these processor cores, you must design your chip around Arm's technology, drop in Arm's IP, verify it all works as intended, and then send it off for manufacture by someone like TSMC or UMC. The Cortex-M55, paired with the Ethos-U55, is expected to run neural-network inference algorithms in small, low-power or embedded devices, allowing these gadgets to use AI to make predictions and decisions on the device itself, without having to rely on a more powerful machine in, say, the cloud.
To fit this all into a small memory and silicon footprint, the microNPU can decompress trained INT8 models on the fly for inference. The architecture is thus suited for so-called "endpoint AI" applications, such as speech recognition or gesture detection in smart speakers and lights.
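For readers wondering why INT8 models suit such small devices, here is a minimal Python sketch of symmetric 8-bit quantization, one common way to shrink trained weights. It is not Arm's scheme or the Ethos-U55's compression format, which the company has not detailed here; the function names and sample weights are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: symmetric per-tensor INT8 quantization.
# Function names and sample weights are hypothetical, not Arm's.
import struct


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.05, 0.9, -0.33, 0.0, 1.1, -0.75]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Compare storage cost: 4 bytes per float32 weight versus 1 byte per INT8 weight.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(quantized)}b", *quantized))
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("float32 bytes:", float32_bytes)
print("int8 bytes:", int8_bytes)
print("max round-trip error:", max_error)
```

The trade-off is a quarter of the storage in exchange for a small rounding error per weight, which tends to be tolerable for tasks such as keyword spotting.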
More complicated models that need to process data at higher precision, for things like facial or object recognition, will need beefier machine-learning accelerators like Arm's Ethos-N77 processor.
Deep-learning systems destined for these low-end microNPU-powered devices can be developed in any framework as long as they are eventually exported as TensorFlow Lite or PyTorch Mobile models.
"Enabling AI everywhere requires device makers and developers to deliver machine learning locally on billions and ultimately trillions of devices," said Dipti Vachani, senior vice president and general manager, Automotive and IoT Line of Business, at Arm. "With these additions to our AI platform, no device is left behind as on-device ML on the tiniest devices will be the new normal, unleashing the potential of AI securely across a vast range of life-changing applications."
Folks can expect silicon using these blueprints to arrive early next year. ®