Microsoft unveils beefy custom AMD chip to crunch HPC workloads on Azure

In-house DPU and HSM silicon also shown off

Ignite One of the advantages of being a megacorp is that you can customize the silicon that underpins your infrastructure, as Microsoft is demonstrating at this week's Ignite conference in Chicago.

Redmond is bringing to its Azure cloud platform a custom hardware security module (HSM) and its own data processing unit (DPU), plus an intriguing custom AMD processor to power virtual machine instances targeting high-performance computing (HPC) workloads.

Described as Microsoft's latest advance in CPU-based supercomputing, the Azure HBv5 virtual machine is powered by custom AMD Epyc 9V64H processors. These are based on Zen 4 CPU cores rather than the latest Zen 5 technology, and run at a peak frequency of up to 4 GHz.

Unlike most VM instances, which typically share a processor with others, the Azure HBv5 will be spread across four Epyc 9V64H processors, for up to 352 cores and up to 9 GB of memory per core, supporting 6.9 TBps of memory bandwidth across 400-450 GB of HBM3 memory.

Microsoft claims this memory bandwidth is up to 8x that of the latest bare-metal or virtual machine instances available on rival platforms. Hence the firm is pitching HBv5 at the most memory-constrained HPC applications, such as computational fluid dynamics, automotive and aerospace simulation, weather modeling, energy research, molecular dynamics, and computer-aided engineering.
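
To put the quoted memory figures in per-core terms, here is a rough back-of-the-envelope sketch. It uses only the numbers given above and assumes decimal units (1 TB = 1,000 GB). The function names are ours, not Microsoft's. Note that 9 GB per core cannot hold with all 352 cores active on 450 GB of HBM3, so the last helper assumes, as our own inference rather than anything Microsoft has stated here, that the ceiling applies when fewer cores are enabled.

```python
# Figures quoted in the article for a single Azure HBv5 instance.
CORES = 352                 # maximum core count
MEM_BW_TBPS = 6.9           # aggregate HBM3 bandwidth, terabytes per second
HBM_GB_RANGE = (400, 450)   # quoted HBM3 capacity range, gigabytes
MAX_GB_PER_CORE = 9         # quoted per-core memory ceiling, gigabytes


def bandwidth_per_core_gbps(total_tbps, cores):
    """Memory bandwidth per core in GB/s (decimal units assumed)."""
    return total_tbps * 1000 / cores


def capacity_per_core_gb(capacity_gb, cores):
    """HBM3 capacity per core in GB when all given cores are active."""
    return capacity_gb / cores


def cores_for_per_core_memory(capacity_gb, gb_per_core):
    """Largest core count that still leaves gb_per_core for each core.

    Assumption (ours): the per-core ceiling is reached by enabling
    fewer cores, not by adding memory.
    """
    return capacity_gb // gb_per_core


if __name__ == "__main__":
    print(f"{bandwidth_per_core_gbps(MEM_BW_TBPS, CORES):.1f} GB/s per core")
    for capacity in HBM_GB_RANGE:
        per_core = capacity_per_core_gb(capacity, CORES)
        few_cores = cores_for_per_core_memory(capacity, MAX_GB_PER_CORE)
        print(f"{capacity} GB: {per_core:.2f} GB/core at {CORES} cores, "
              f"{MAX_GB_PER_CORE} GB/core at about {few_cores} cores")
```

On these numbers, a fully populated instance works out to a little under 20 GBps of memory bandwidth per core.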

Each instance also gets a 14 TB local NVMe SSD, said to be capable of up to 50 GBps read and 30 GBps write bandwidth, and 800 Gbps of Nvidia Quantum-2 InfiniBand networking.
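
The storage figures are in gigabytes per second (GBps) while the network figure is in gigabits per second (Gbps), so they are not directly comparable as written. A small sketch of the conversion, using only the numbers quoted above and assuming decimal units and sustained peak rates, which real workloads are unlikely to hit; the helper names are ours.

```python
# Figures quoted in the article for one HBv5 instance.
SSD_TB = 14          # local NVMe capacity, terabytes
SSD_READ_GBPS = 50   # peak read bandwidth, gigabytes per second
SSD_WRITE_GBPS = 30  # peak write bandwidth, gigabytes per second
NET_GBITPS = 800     # InfiniBand bandwidth, gigabits per second


def gbit_to_gbyte(gbit_per_s):
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8


def seconds_to_transfer(capacity_tb, rate_gbyte_s):
    """Seconds to move capacity_tb at a sustained rate (1 TB = 1,000 GB)."""
    return capacity_tb * 1000 / rate_gbyte_s


if __name__ == "__main__":
    print(f"Network: {gbit_to_gbyte(NET_GBITPS):.0f} GB/s")
    print(f"Full SSD read: {seconds_to_transfer(SSD_TB, SSD_READ_GBPS):.0f} s")
    print(f"Full SSD write: {seconds_to_transfer(SSD_TB, SSD_WRITE_GBPS):.0f} s")
```

In other words, 800 Gbps of networking is 100 GBps, twice the quoted peak read rate of the local SSD.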

One intriguing fact Redmond disclosed is that the cluster of custom chips making up each HBv5 instance will have twice as much total Infinity Fabric bandwidth between them as "any AMD Epyc server platform to date."

This led some on The Reg systems desk to suspect that the Epyc 9V64H may actually be a version of AMD's MI300A APU chip, but with all CPU cores rather than a mix of GPU and CPU cores. We asked Microsoft for more details and will report back if we hear more.

However, Azure HBv5 instances aren't even available as a technology preview yet. Anyone interested can sign up for access to the preview, which is set to start in the first half of 2025, Microsoft said.

Azure is also getting Microsoft's first in-house DPU, the imaginatively named Azure Boost DPU. As Reg readers will know, this is basically a programmable chip designed to offload network and/or storage processing from the host CPUs in a datacenter server.

This is based on tech that the cloud colossus gained from its acquisition of Fungible last year, and integrates high-speed Ethernet and PCIe interfaces along with network and storage engines, data accelerators, and security features, into a fully programmable system-on-chip.

"Built specifically for the Azure infrastructure, Azure Boost DPU is a hardware-software co-design that runs a custom, lightweight data-flow operating system to enable agile platforms with higher performance, lower power consumption, and enhanced efficiency compared to traditional implementations," said Corporate VP of Silicon Pradeep Sindhu, co-founder and former CEO of Fungible.

Another piece of custom silicon is Azure Integrated HSM. This type of chip is a dedicated hardware security component that performs encryption/decryption, and keeps the associated keys securely stored on the chip itself.

This kind of resource is not new, and cloud platforms, including Azure, already offer HSMs. However, Microsoft says that Azure Integrated HSM eliminates the latency of network round-trips to remote HSM services, or of seeking the release of keys from those remote HSMs.

"As a server-local HSM that securely binds to the workload environments, Azure Integrated HSM provides locally attached HSM services to both confidential and general-purpose virtual machines and containers. This provides the benefit of industry-leading in-use key protection without the latency drawbacks of round-trip network-attached HSM calls," explained chief technology officer Mark Russinovich in a blog post announcing the new silicon.

Starting next year, an Azure Integrated HSM will be part of every new server deployed on Azure, Microsoft said. ®
