Micron joins the CXL 2.0 party with a 256GB memory expander
RAM disguised as an SSD? What will they come up with next?
Micron has become the latest chipmaker to announce a Compute Express Link (CXL) 2.0-compliant memory expansion module capable of strapping up to 256GB of DRAM to a spare PCIe x8 interface.
While development on the CXL standard began in earnest in early 2019, it wasn't until recently that compatible CPUs from AMD and Intel were readily available. The chipmakers' respective 4th-gen Epycs and Xeons are the first to support the standard.
The technology defines a cache-coherent interface for connecting CPUs, memory, accelerators, and other peripherals, built atop PCIe. For the moment, however, all of the practical applications revolve around attaching DRAM to a PCIe interface much the same way you might plug in an NVMe SSD. That's essentially what Micron has announced with its CZ120 memory expansion modules, which actually use the same E3.S 2T form factor found in some datacenter SSDs.
The modules are available in 128GB and 256GB capacities and utilize a "unique dual-channel memory architecture" capable of delivering 36GB/s of bandwidth.
It's not exactly clear how Micron has managed this, as a PCIe 5.0 x8 interface is only good for around 32GB/s. We've reached out to Micron for comment, but if we had to guess, the chipmaker is quoting the memory bandwidth delivered by the DRAM to the CXL controller, not to the host system.
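A quick back-of-the-envelope calculation illustrates the gap. The figures below are our own arithmetic from the published PCIe 5.0 signalling rate and 128b/130b line encoding, not numbers from Micron, and they ignore packet-level protocol overhead, which shaves off a little more in practice:

```python
# Rough PCIe 5.0 x8 bandwidth estimate, to show why the quoted 36GB/s
# likely refers to the DRAM side, not the host link.
GT_PER_SEC = 32          # PCIe 5.0 raw signalling rate per lane (GT/s)
ENCODING = 128 / 130     # 128b/130b line encoding overhead
LANES = 8

bytes_per_lane = GT_PER_SEC * ENCODING / 8   # GB/s per lane
link_gbps = bytes_per_lane * LANES           # GB/s for the full x8 link

print(f"Per lane: {bytes_per_lane:.2f} GB/s")
print(f"x8 link:  {link_gbps:.2f} GB/s")     # ~31.5 GB/s, under the 36GB/s quoted
```

The x8 link tops out around 31.5GB/s before protocol overhead, which is where the "around 32GB/s" figure comes from.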
Who are these for?
Micron sees two key use cases for these modules. The first is the ability to add "incremental" memory capacity to support growing workloads. If, for example, you've maxed out the memory capacity of your CPU, slotting in a few CXL modules is one way to add capacity to an existing system without migrating to a multi-socket platform. Using eight CZ120 modules, Micron says users can extend their system memory by up to 2TB.
The second use case involves memory-bandwidth-constrained workloads, a common bottleneck in high-performance compute applications. If you've maxed out the memory bandwidth of your system, CXL memory modules offer a way to stretch beyond it. In the case of Micron's CZ120, each module is good for an additional 36GB/s of memory bandwidth on the module side. The eight modules from the previous example therefore work out to 288GB/s of raw module bandwidth, though the PCIe 5.0 x8 links cap what the host actually sees at closer to 256GB/s.
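As a sanity check on those scaling claims, here is the arithmetic for eight of the 256GB modules, using the capacity and bandwidth figures above; the ~32GB/s link ceiling is our own approximation of a PCIe 5.0 x8 interface, not a Micron figure:

```python
# Scaling math for eight CZ120 modules (256GB variant).
MODULES = 8
CAPACITY_GB = 256        # per-module capacity, per Micron
MODULE_BW = 36           # GB/s, DRAM-side figure quoted by Micron
LINK_BW = 32             # GB/s, approximate PCIe 5.0 x8 ceiling (assumption)

total_capacity_tb = MODULES * CAPACITY_GB / 1024
raw_bw = MODULES * MODULE_BW                  # module-side aggregate
host_bw = MODULES * min(MODULE_BW, LINK_BW)   # host-visible aggregate

print(f"Added capacity:      {total_capacity_tb:.0f} TB")   # 2 TB
print(f"Module-side BW:      {raw_bw} GB/s")                # 288 GB/s
print(f"Host-visible BW cap: {host_bw} GB/s")               # 256 GB/s
```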
While attaching memory over PCIe might sound like a bad idea if you care about latency — there's a reason memory DIMMs are crammed as close to the socket as possible — it's not as big a problem as you might think. In fact, we're told the latency is roughly equivalent to a NUMA hop in a dual-socket system.
One use case that Micron didn't discuss in its release, but which in theory should be supported by the modules, is memory pooling. The functionality was introduced in version 2.0 of the spec and allows a CXL memory module to be carved up and accessed by multiple processors. This means that if you've got a multi-socket system or a CXL-compatible accelerator, those processors should be able to take advantage of this memory as well.
Micron's CZ120 is sampling to customers now. However, it's worth noting that Micron is hardly the first to announce a CXL 2.0-compatible product. Samsung announced a 128GB E3.S form factor CXL memory module back in May. Meanwhile, Astera Labs has been talking up its Leo-P series chips since last year.
CXL's next step
While many of CXL's early applications center on memory, the interconnect is far more flexible. With the introduction of the CXL 3.0 spec last year, the standard gained support for switch fabrics, which opened the door to fully disaggregated architectures.
Instead of just pooling memory between a couple of host CPUs, all manner of devices, whether storage, networking, or GPUs, can be accessed over the CXL fabric. In other words, rather than limiting access to a bank of GPUs to a single server, you could instead have one CXL-enabled GPU appliance accessible to multiple clients at any given time.
The idea is really no different from connecting a bunch of network switches so that clients on each side of the network can talk to systems on the other, except instead of TCP and UDP over Ethernet, we're talking about CXL running over PCIe.
However, this vision remains a few years off. Chipmakers have only begun rolling out support for CXL 1.1 — though some do support CXL 2.0 for memory devices already, like AMD's Epyc. It'll take some time before the full feature set of CXL reaches maturity. ®