Ampere said today it hopes to bring a 128-core data-center-grade microprocessor to the market next year.
Both microchips are, or will be, fabricated by TSMC using its 7nm process node, and both use a coherent mesh to arrange their CPU cores on the die. They are both socket compatible, and both feature up to 128 PCIe 4 lanes and eight channels for DDR4-3200 RAM per socket. Both have one thread per core.
Samples of the Altra Max will, it is hoped, roll off the assembly lines in the final quarter of this year, and be sold at volume in 2021 primarily to cloud providers. A 5nm test chip, featuring some Altra blueprints, has been taped out, though a product on that process node is not due to enter the market until 2022.
Execs at Ampere were keen to stress that the Altra series has an emphasis on predictable and consistent performance: if you want to run all 80 or 128 cores at 2 or 3GHz all the time, you can. There's no dynamic shifting of speed, unless you want that, meaning cloud customers can be guaranteed and delivered fixed or deterministic levels of performance.
A top-end 80-core Altra chip clocked at 3.3GHz has a maximum TDP of 250W. We're promised that a 128-core Altra Max clocked at about the same frequency, around 3GHz, will also have a max TDP of 250W. That's more cores and performance within the same power envelope, mainly due to improvements in the design and the fact that the Altra series was designed with 40W of headroom to expand into, according to Ampere.
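For a rough sense of what that envelope means per core, here is a back-of-the-envelope sketch using only the TDP and core counts quoted above. The function name `watts_per_core` is ours, and the result is an upper bound at best: TDP also covers the mesh, memory controllers, and IO, so the real per-core figure will be lower.

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """Naive average: whole-package TDP divided evenly across cores.

    This ignores uncore power (mesh, memory controllers, PCIe), so treat
    the result as a ceiling, not a measurement.
    """
    return tdp_watts / cores


altra = watts_per_core(250, 80)       # 80-core Altra at 250W
altra_max = watts_per_core(250, 128)  # 128-core Altra Max at 250W

print(f"Altra:     {altra:.3f} W per core")
print(f"Altra Max: {altra_max:.3f} W per core")
# Fractional cut in the per-core budget when moving from 80 to 128 cores
print(f"Per-core budget shrinks by {1 - altra_max / altra:.1%}")
```

In other words, fitting 60 per cent more cores into the same 250W means each core gets by on a bit more than a third less power on average, which is where the design improvements and slightly lower clock speed come in.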
The 128-core part has a larger single die than its 80-core sibling; Ampere ruled out for the time being using multi-die chip packages due to the latency between dies. The 5nm cousin, if or when it arrives, will sport more CPU cores, as well as increased IO and memory bandwidth to service them.
One thing to bear in mind with these high-core-count processors is the potential issue of resource contention. Software, from applications down to kernels and drivers, may have locks and other mutually exclusive resources that can be acquired and released many times a second without much contention to slow them down – when using eight, 16, 32, or even 64 cores. When you start getting up to 128, or higher, unexpected bottlenecks may start to appear – and not just in locks, IO, and memory, but also in the caches and in the mesh interconnect.
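To make the lock-contention point concrete, here is a minimal sketch contrasting a counter guarded by one shared lock with a sharded (striped) counter in which each worker mostly touches its own lock. All the names here (`LockedCounter`, `ShardedCounter`, `run`) are hypothetical and have nothing to do with Ampere's software. Note that CPython's global interpreter lock means this illustrates the structure rather than the speed-up; the pattern pays off in runtimes where threads genuinely run in parallel.

```python
import threading


class LockedCounter:
    """One lock guards one value: every thread queues on the same mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, n=1):
        with self._lock:
            self._value += n

    def total(self):
        with self._lock:
            return self._value


class ShardedCounter:
    """One lock per shard: workers on different shards never collide."""

    def __init__(self, shards):
        self._locks = [threading.Lock() for _ in range(shards)]
        self._values = [0] * shards

    def add(self, worker_id, n=1):
        i = worker_id % len(self._locks)  # map the worker to its shard
        with self._locks[i]:
            self._values[i] += n

    def total(self):
        # Reads are rarer than writes, so summing shard by shard is acceptable.
        result = 0
        for i, lock in enumerate(self._locks):
            with lock:
                result += self._values[i]
        return result


def run(workers, increments):
    """Hammer both counters from `workers` threads and return both totals."""
    locked = LockedCounter()
    sharded = ShardedCounter(shards=workers)

    def work(worker_id):
        for _ in range(increments):
            locked.add()
            sharded.add(worker_id)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return locked.total(), sharded.total()


if __name__ == "__main__":
    print(run(workers=8, increments=1000))
```

Both counters arrive at the same total; the difference is that the sharded version spreads the writes over many locks, so adding cores adds less waiting. The trade-off is a slower, multi-lock read.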
Ampere's Jeff Wittich, senior veep of products, told us his software team has been "knocking down scheduling barriers," allowing, for example, Kubernetes to efficiently run containers across all available CPU cores on an Altra processor. "It runs real-life operating systems, this is not an academic system no one's going to deploy," he said.
For what it's worth, Cloudflare says it evaluated the Altra and wants to try out the Max, too; Genymobile has been using Ampere's silicon to provide virtual Android devices in the cloud to enterprise app developers; and Scaleway hopes to evaluate the Altra family for its cloud platform later this year. Microsoft, Packet, and Oracle have also shown an interest as Ampere – backed by Oracle and others – continues to tout its wares across America, Europe, and China.
Your humble vulture thinks it is fair to say that, generally speaking in the land of data centers, buyers tend to wait until at least the second generation of a chip series arrives before committing to a family. Ampere did offer an eMag microprocessor in 2018, though that had 32 cores and used X-Gene blueprints acquired from Applied Micro. With the Arm Neoverse-based Altra announced a few months ago, and said to be shipping this year, and the Max now promised for next year, we imagine Ampere wants to hurry everyone along to the part where they place significant orders.
Ampere is also playing a game of leapfrog with Marvell, which, soon after the announcement of the 80-core Altra, teased the world with a glimpse of its 7nm 96-core, 384-thread ThunderX3. ®