Hailo's latest AI chip shows up integrated NPUs and sips power like fine wine

All your PC needs for 40 TOPS is an M.2 slot

Today, users who want to interface with AI usually do so through a cloud-based service like ChatGPT or Microsoft Copilot, rather than locally.

Part of the reason for this is that there are just not many great options for running AI and large language models (LLMs) on end-user hardware, although we did break down some ways to do this a few weeks ago. As of now, there aren't many CPUs with integrated neural processing units (NPUs), unless you're looking at the latest laptop CPUs from Intel, AMD, and Qualcomm, or for desktop, the Ryzen 8000 series.

Lacking an NPU means users will have to run AI workloads on graphics devices, but this isn't perfect either. Often, only the graphics on Intel's and AMD's latest laptop CPUs are sufficient, and the only other option is a dedicated graphics card, which is expensive and draws lots of power.

However, add-in AI accelerators could become an appealing alternative, and Hailo is making its case with the launch of the Hailo-10 AI processor. Hailo promises to bring AI to a wide range of PCs and other devices locally, taking LLMs out of the cloud and to the edge.

The AI chip that can run on an M.2 stick

Hailo-10 is somewhere in the middle when it comes to performance among competing NPUs. It's rated to deliver 40 TOPS of INT4 performance, equivalent to 20 TOPS of INT8. For comparison, Intel's Core Ultra Meteor Lake NPU is capable of 11 TOPS at INT8, and AMD's XDNA processor in the Ryzen 8040 Hawk Point lineup can go up to 16 TOPS. That's a sizeable performance advantage over the two PC chipmaking titans.

While the Hailo-10 shows promise, upcoming chips are poised to surpass it. Intel claims its upcoming Lunar Lake chips have an NPU that clocks in at 45 TOPS, and although it's not clear if this is INT4 or INT8 performance, either way it would beat the Hailo-10. Similarly, Qualcomm's Snapdragon X Elite has 45 TOPS of INT8, more than double that of Hailo's new chip.
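To see why the precision footnote matters, the comparison can be redone with every rating normalized to INT8. The sketch below is illustrative only: it assumes the simple halving rule implied by the 40 INT4 TOPS to 20 INT8 TOPS equivalence quoted for the Hailo-10, and the function and variable names are made up for this example.

```python
# Assumption: INT4 TOPS convert to INT8-equivalent TOPS by halving,
# mirroring the 40 INT4 = 20 INT8 equivalence quoted for the Hailo-10.
INT8_FACTOR = {"INT8": 1.0, "INT4": 0.5}


def to_int8_tops(tops: float, precision: str) -> float:
    """Convert a vendor TOPS rating to an INT8-equivalent figure."""
    return tops * INT8_FACTOR[precision]


hailo_10 = to_int8_tops(40, "INT4")

# Ratings as quoted in the article. Lunar Lake's precision is not known,
# so both possible readings are listed.
rivals = {
    "Meteor Lake NPU": to_int8_tops(11, "INT8"),
    "Ryzen 8040 XDNA": to_int8_tops(16, "INT8"),
    "Lunar Lake (if INT8)": to_int8_tops(45, "INT8"),
    "Lunar Lake (if INT4)": to_int8_tops(45, "INT4"),
    "Snapdragon X Elite": to_int8_tops(45, "INT8"),
}

for name, tops in rivals.items():
    verdict = "ahead of" if tops > hailo_10 else "behind"
    print(f"{name}: {tops:g} INT8 TOPS, {verdict} Hailo-10 ({hailo_10:g})")
```

Under that assumption, Lunar Lake lands ahead of the Hailo-10 whichever way its 45 TOPS figure was measured.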

Performance isn't everything, however, and Hailo has two tricks up its sleeve, one of which is power consumption. "Hailo-10 is faster and more energy efficient than integrated neural processing unit solutions," Hailo CTO Avi Baum told The Register. He added that "a separate NPU is advantageous" over integrated NPUs thanks to lower power consumption, which means more battery life and less heat.

The company claims that Hailo-10 operates at less than five watts, and the first member of the family, the Hailo-10H, has a typical power consumption of less than 3.5 watts. Hailo claims this is half the power Intel's Meteor Lake NPU requires, and with roughly double the performance, the Hailo-10 is four times more efficient.
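The efficiency claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted here; the Meteor Lake NPU wattage is not a published number in this article but an assumption inferred from Hailo's "half the power" claim, and 3.5 watts is treated as the Hailo-10H's upper bound.

```python
# Figures taken from the article. The Meteor Lake NPU wattage is an
# assumption inferred from Hailo's "half the power" claim, not a spec.
HAILO_10H_INT8_TOPS = 20.0   # 40 INT4 TOPS, quoted as equivalent to 20 INT8 TOPS
HAILO_10H_WATTS = 3.5        # "less than 3.5 watts" typical, used as an upper bound
METEOR_LAKE_INT8_TOPS = 11.0
METEOR_LAKE_WATTS = 2 * HAILO_10H_WATTS  # assumed: twice the Hailo-10H figure


def tops_per_watt(tops: float, watts: float) -> float:
    """Return throughput per watt; watts must be positive."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


hailo_eff = tops_per_watt(HAILO_10H_INT8_TOPS, HAILO_10H_WATTS)
intel_eff = tops_per_watt(METEOR_LAKE_INT8_TOPS, METEOR_LAKE_WATTS)
efficiency_ratio = hailo_eff / intel_eff

print(f"Hailo-10H: {hailo_eff:.2f} TOPS/W")
print(f"Meteor Lake NPU (inferred): {intel_eff:.2f} TOPS/W")
print(f"Ratio: {efficiency_ratio:.2f}x")
```

With these inputs the ratio comes out a little under four, since 20 TOPS is slightly less than double 11 TOPS.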

Getting these chips into PCs is the first step. Hailo has opted for the compact M.2-2242 form factor, a common interface for storage and expansion cards, to integrate the Hailo-10H into PCs. M.2 slots usually take NVMe SSDs, but can be used for other devices including AI accelerators. M.2-powered AI processors are nothing new; both Hailo and companies like Google have made them before. The Hailo-10's relatively high performance does make it stand out, though.

"The ability to have an accelerator separately from the main processing unit allows to add AI capabilities to a range of platforms that are not equipped with integrated NPUs," said Baum, noting that many high-performance CPUs today don't have NPUs at all, such as desktop, workstation, and server chips from AMD, Intel, and others.

Even for chips that already have integrated NPUs, installing a separate AI accelerator can still make sense, Baum said. "As this is a fast-moving domain, the ability to further boost the more capable platforms is also relevant for the high-end platforms with integrated NPUs." After all, Meteor Lake's 11 TOPS NPU is already outclassed by the Hailo-10, which would be a big upgrade.

However, a potential drawback with using an M.2 slot for the Hailo-10H (and future members of the Hailo-10 family) is that lots of PCs don't have many of them. Plenty of laptops have only two, one for an SSD and the other for a Wi-Fi chip. For many current devices, adding a Hailo-10H or any M.2-based accelerator would be impossible.

Hailo-10 is already garnering interest

Things look more positive for Hailo when it comes to future devices made with the Hailo-10 in mind. "We see a lot of potential for local execution of generative AI and LLMs in personal computers and car infotainment systems," Baum said. "We are already working with leading OEMs in these markets for implementation of Hailo-10 into their devices."

Baum didn't mention who these OEMs were, but at least one PC manufacturer is interested in using add-in AI accelerators for its PCs. At the last CES, Lenovo showed off its ThinkCentre Neo Ultra, which the company says will utilize a separate AI chip to accompany its NPU-lacking Core i9 and RTX 4060 graphics card. Neither of the two M.2-based AI processors Lenovo demonstrated was made by Hailo, but it certainly shows that there's a market for the Hailo-10H.

Notably, PCs that normally wouldn't be able to meet Microsoft's definition of being an AI PC can technically do so with the Hailo-10H, which has the minimum 40 TOPS Microsoft asks for. By calculating its TOPS in INT4 rather than INT8, Hailo does trade away some accuracy, but for consumer PCs this can be acceptable, especially since INT4 requires less RAM than INT8, which uses 1 GB per billion parameters.
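The RAM trade-off follows directly from the bit width of each weight. As a rough sketch, counting weight storage only (no activations, caches, or runtime overhead) and using a hypothetical 7-billion-parameter model that is not tied to any product mentioned here:

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 10**9 bytes), weights only."""
    if bits_per_weight <= 0:
        raise ValueError("bits_per_weight must be positive")
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical model size, chosen purely for illustration.
MODEL_PARAMS_BILLIONS = 7

for bits in (8, 4):
    size = weight_memory_gb(MODEL_PARAMS_BILLIONS, bits)
    print(f"{MODEL_PARAMS_BILLIONS}B parameters at INT{bits}: {size:.1f} GB")
```

At INT8 each parameter takes one byte, which is where the 1 GB per billion parameters figure comes from; INT4 halves that.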

"We were aiming to reach a high enough TOPS capacity to support running LLMs and GenAI on the edge without increasing power consumption and cost," Baum said of meeting Microsoft's AI PC requirement. "This is not accidental that this is more or less where the rest of the industry lands."

Although PCs are a primary focus for the Hailo-10, it's apparently getting wider attention from other markets. "In recent months we are being approached by manufacturers from a very wide range of industries including retail, medical devices, security, and others," said Baum. Smartphones, however, don't seem to be on the table for Hailo at the moment.

Availability and pricing for the Hailo-10H, currently the only member of the Hailo-10 series, haven't been disclosed yet. For reference, the previous Hailo-8 M.2 accelerator launched in 2020 and went for $179, so we can probably expect a price tag in the triple digits for the Hailo-10 as well. That's not cheap, but buying a PC or a CPU with an integrated NPU is probably going to be much more expensive. ®

