Intel's neuromorphic 'owl brain' swoops into Sandia labs
Hala Point system crams more than a thousand neurochips into a 6U chassis to tackle real-time AI
Intel Labs revealed its largest neuromorphic computer on Wednesday, a 1.15 billion neuron system, which it reckons is roughly analogous to an owl's brain.
But don't worry, Intel hasn't recreated Fallout's Robobrain. Instead of a network of organic neurons and synapses, Intel's Hala Point emulates them all in silicon.
At roughly 20 W, our brains are surprisingly efficient at processing the large quantities of information streaming in from each of our senses at any given moment. The field of neuromorphics, which Intel and IBM have spent the past few years exploring, aims to emulate the brain's network of neurons and synapses to build computers capable of processing information more efficiently than traditional accelerators.
How efficient? According to Intel, its latest system – delivered to Sandia National Labs in the US – is a 6U box roughly the size of a microwave that consumes 2,600 W, and can reportedly achieve deep neural network efficiencies as high as 15 TOPS/W at 8-bit precision. To put that in perspective, Nvidia's most powerful system, the Blackwell-based GB200 NVL72, which has yet to even ship, manages just 6 TOPS/W at INT8, while its current DGX H100 systems can manage about 3.1 TOPS/W.
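For a rough sense of those ratios, here is a minimal back-of-the-envelope sketch using only the vendor-quoted TOPS/W figures above – nothing measured, just arithmetic:

```python
# Back-of-the-envelope comparison of the 8-bit efficiency figures quoted above.
# All values are vendor-reported TOPS/W claims, not independent benchmarks.
hala_point = 15.0    # TOPS/W claimed for Intel's Hala Point (Loihi 2)
gb200_nvl72 = 6.0    # TOPS/W at INT8 for Nvidia's GB200 NVL72
dgx_h100 = 3.1       # TOPS/W at INT8 for Nvidia's DGX H100

print(f"Hala Point vs GB200 NVL72: {hala_point / gb200_nvl72:.1f}x")  # ~2.5x
print(f"Hala Point vs DGX H100:    {hala_point / dgx_h100:.1f}x")     # ~4.8x
```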
Researchers at Sandia National Labs take delivery of Intel's 1.15 billion neuron Hala Point neuromorphic computer
This performance is achieved using 1,152 of Intel's Loihi 2 processors, which are stitched together in a three-dimensional grid for a total of 1.15 billion neurons, 128 billion synapses, 140,544 processing cores, and 2,300 embedded x86 cores that handle the ancillary computations necessary to keep the thing chugging along.
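Dividing those headline totals by the 1,152 chips gives a rough per-chip picture – a quick sanity check on the figures quoted here, not an official Loihi 2 spec sheet:

```python
# Per-chip figures derived by dividing the quoted system totals by the chip count.
chips = 1152
neurons = 1.15e9       # total neurons in Hala Point
synapses = 128e9       # total synapses
neuro_cores = 140_544  # total neuromorphic processing cores
x86_cores = 2_300      # total embedded x86 cores

print(f"neurons per chip:   ~{neurons / chips:,.0f}")   # just under a million per chip
print(f"synapses per chip:  ~{synapses / chips:,.0f}")  # ~111 million
print(f"neuro cores/chip:   {neuro_cores // chips}")    # 122, per these totals
print(f"x86 cores per chip: ~{x86_cores / chips:.0f}")  # ~2
```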
To be clear, those aren't typical x86 cores. "They are very, very simple, small x86 cores. They're not anything like our latest cores or Atom processors," Mike Davies, director of neuromorphic computing at Intel, told The Register.
If Loihi 2 rings a bell, that's because the chip has been knocking around for a while now, having made its debut back in 2021 as one of the first chips produced using Intel's 7nm process tech.
Despite its age, Intel says its Loihi-based systems are capable of solving certain AI inference and optimization problems as much as 50x faster than conventional CPU and GPU architectures while consuming 100x less power. Those numbers appear to have been achieved [PDF] by pitting a single Loihi 2 chip against Nvidia's tiny Jetson Orin Nano and a Core i9-7920X CPU.
Don't throw out your GPUs yet
While that might sound impressive, Davies admits that Intel's neuromorphic accelerators aren't ready to replace GPUs for every workload just yet. "This is not a general-purpose AI accelerator by any means," he said.
For one, the large language models (LLMs) powering apps like ChatGPT – arguably AI's most popular application – won't run on Hala Point, at least not yet.
"We're not mapping any LLM to Hala Point at this time. We don't know how to do that. Quite frankly, the neuromorphic research field does not have a neuromorphic version of the transformer," Davies said, noting that there is some interesting research into how that might be achieved.
Having said that, Davies' team has had success running a traditional deep neural network – a multi-layer perceptron – on Hala Point, with some caveats.
"If you can sparsify the network activity and the conductivity in that network, that's when you can achieve really, really big gains," he said. "What that means is that it has to be processing a continuous input signal … a video stream or an audio stream, something where there's some correlation from sample to sample to sample."
Intel Labs demonstrated Loihi 2's potential for video and audio processing in a paper published [PDF] late last year. In testing, the team found the chip achieved significant gains in energy efficiency, latency, and throughput for signal processing – sometimes exceeding three orders of magnitude – compared to conventional architectures. However, the largest gains came at the cost of lower accuracy.
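To make the sample-to-sample correlation point concrete, here's a toy delta-threshold sketch in plain NumPy: when consecutive samples of a stream barely change, only the values that cross a threshold need to be propagated, so the work per frame collapses. This is a generic illustration of sparse, event-driven processing, not how Loihi 2 is actually programmed.

```python
import numpy as np

# Toy illustration of exploiting sample-to-sample correlation in a stream:
# propagate only the values that change by more than a threshold ("events").
# Highly correlated input (video, audio) then yields very sparse activity.
rng = np.random.default_rng(0)
threshold = 0.05
n_values, n_frames = 10_000, 100

# A slowly changing signal: each frame is the previous one plus small noise.
frames = [rng.random(n_values)]
for _ in range(n_frames - 1):
    frames.append(frames[-1] + rng.normal(0.0, 0.02, n_values))

events = 0
for prev, curr in zip(frames, frames[1:]):
    changed = np.abs(curr - prev) > threshold  # only these would be sent on
    events += int(changed.sum())

dense_updates = n_values * (n_frames - 1)
print(f"events: {events:,} of {dense_updates:,} dense updates "
      f"({events / dense_updates:.1%})")  # typically a percent or two
```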
The ability to process real-time data at low power and latency has made the tech attractive for applications like autonomous vehicles, drones, and robotics.
Another use case that's shown promise is combinatorial optimization problems, like route planning for a delivery vehicle, which has to navigate a busy city center.
These workloads are incredibly hard to solve because small changes – vehicle speeds, accidents, lane closures – have to be accounted for on the fly. Conventional computing architectures aren't well suited to this kind of exponential complexity, which is why we've seen so many quantum computing vendors targeting optimization problems.
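To put the combinatorial blow-up in concrete terms: brute-forcing a delivery route means checking every ordering of the stops, and the number of orderings grows factorially with the stop count. A quick, purely illustrative calculation – nothing neuromorphic about it:

```python
from math import factorial

# Distinct visit orders for a route with a fixed starting point: (n - 1)!
# Exhaustive search becomes hopeless almost immediately, which is why
# heuristics and specialised hardware get pitched at problems like this.
for stops in (5, 10, 15, 20):
    print(f"{stops:2d} stops -> {factorial(stops - 1):,} possible routes")
#  5 stops -> 24
# 10 stops -> 362,880
# 15 stops -> 87,178,291,200
# 20 stops -> 121,645,100,408,832,000
```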
However, Davies argues that Intel's neuromorphic computing platform is "far more mature than these other experimental research alternatives."
Room to grow
According to Davies, there's also still plenty of headroom to be unlocked. "I'm sad to say it's not fully even exploited to this day because of software limitations," he said of the Loihi 2 chips.
Identifying hardware bottlenecks and software optimizations is part of the reason Intel Labs has deployed the prototype at Sandia.
"Understanding the limitations, especially at the hardware level, is a very important part of getting these systems out there," Davies said. "We can fix the hardware issues, we can improve it, but we need to know what direction to optimize."
This wouldn't be the first time Sandia boffins have gotten their hands on Intel's neuromorphic tech. In a paper published in early 2022, researchers found the tech had potential for HPC and AI. However, those experiments used Intel's first-gen Loihi chips, which have roughly an eighth the neurons (128,000 vs 1 million) of their successor. ®