IBM says these back-office, network edge Power 10 servers would be sweet for – yes, you guessed it – AI
Short on cores, big on threads and matrix math
Not to be left out of the AI infrastructure game, on Tuesday IBM unveiled a pair of tiny Power 10 servers designed to preprocess data at the network edge.
The Power S1012 systems are available in both a PC-style tower configuration and a more traditional 2U half-width rack mount chassis. Both can be equipped with IBM's homegrown Power 10 processor with one, four, or eight cores enabled and up to 256GB of onboard memory.
While that might not sound like many cores next to Intel and AMD's edge-centric chips, which can be had with up to 64 cores, it's worth noting that IBM's Power platform is based on a RISC architecture that prioritizes highly threaded workloads, with support for either SMT4 or SMT8.
That means the Power 10 eSCM modules used in these systems can support up to eight threads per core, which on the top-specced configuration works out to a more respectable 64 threads.
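The arithmetic behind that figure is simple: logical threads are cores multiplied by the SMT level. A quick sketch of the S1012's advertised core options (the numbers below are just the cores-times-SMT math, not IBM benchmarks):

```python
# Back-of-the-envelope thread counts for the S1012's core options,
# assuming the SMT4/SMT8 modes described above. Illustrative only.
CORE_OPTIONS = [1, 4, 8]
SMT_MODES = {"SMT4": 4, "SMT8": 8}

for cores in CORE_OPTIONS:
    for mode, threads_per_core in SMT_MODES.items():
        print(f"{cores} cores x {mode} = {cores * threads_per_core} threads")
```

The top-specced eight-core part in SMT8 mode is where the 64-thread figure comes from.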
IBM boasts its new servers are up to three times more powerful than the outgoing Power S814, which may sound impressive until you consider that system is based on the decade-old Power 8 platform. Having said that, the Power 10 family isn't exactly fresh anymore either: it's due to celebrate its third birthday in September.
IBM envisions these systems being deployed in a number of scenarios, including AI inferencing in space- or power-constrained edge deployments, or running more traditional workloads in remote or back-office settings.
The chief argument appears to be that by processing all the data streaming in from the edge in place rather than shuttling it all back to a central datacenter, customers can reduce latencies and curb bandwidth consumption.
By all appearances, IBM is targeting existing Power customers familiar with the particular hardware and software nuances of the SMT-heavy architecture. One of those customers is analytics wrangler Equitus, which IBM says is already using the systems to run its AI models at the edge.
How IBM is going about processing those AI workloads differs considerably from what you might expect. From what we can tell, these systems aren't equipped with GPUs; IBM's announcement makes no reference to them. Instead, IBM appears to be leaning on the processors' matrix math accelerators (MMAs), four of which are baked into each core, to do the heavy lifting.
In many respects, these MMAs are reminiscent of the AMX engines that appeared in Intel's 4th and 5th-gen Xeon Scalable platforms from 2023. And as we've recently explored, those engines are more than capable of running small large language models in the 7 to 13 billion parameter range.
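To see why matrix engines matter for this class of workload, consider a rough compute estimate for transformer inference, using the common approximation of roughly two floating-point operations per model parameter per generated token. The figures below are back-of-the-envelope illustrations for the 7B-13B model sizes mentioned above, not IBM or Intel benchmarks:

```python
# Rough per-token compute for small LLMs, using the common
# ~2 FLOPs-per-parameter-per-token approximation. Nearly all of
# that work is matrix multiplication, which is exactly what
# engines like MMA and AMX accelerate. Illustrative figures only.
def flops_per_token(params: float) -> float:
    return 2.0 * params

for params in (7e9, 13e9):
    gflops = flops_per_token(params) / 1e9
    print(f"{params / 1e9:.0f}B parameters: ~{gflops:.0f} GFLOPs per token")
```

At tens of gigaFLOPs per token, in-core matrix units are what make CPU-only inference plausible at the edge.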
Alongside the MMAs, IBM also highlighted support for transparent memory encryption to safeguard data moving in and out of AI models on the device. Considering these systems are likely to be deployed in remote locations with limited security or supervision, that's a welcome feature, particularly for those in highly regulated industries.
The S1012 systems will be available for purchase beginning June 14. ®
PS: IBM just released a family of code-generating models to the open source world, saying: "The aim is to make coding as easy as possible — for as many developers as possible."