Why you should start paying attention to CXL now

The next server you buy will support it, but what's it good for?


After more than three years of development, Compute Express Link (CXL) is nearly here. The long-awaited interconnect tech will make its debut alongside Intel's upcoming Sapphire Rapids and AMD's Genoa processor families.

This means there's a good chance the next server you buy will support the emerging interconnect tech. So what is it good for?

For now, in its 1.1 iteration, the CXL conversation centers on memory expansion and tiered memory applications. Need more RAM than you've got DIMM slots? Just pop a CXL memory module into an empty PCIe 5.0 slot, and you're off to the races.

Yes, it'll be lower performance and introduce a little latency, but if you're memory constrained and Samsung's upcoming 512GB DDR5 DIMMs aren't in your budget, it might be worth considering, especially now that Intel Optane is dead.

Data being the new oil and memory still among the most expensive components in the datacenter — likely more so since your shiny new CXL-compatible system will also be sporting DDR5 — these capabilities alone make CXL attractive in light of the ever-expanding scope of AI/ML, big data, and database workloads.

"If you're bandwidth restricted rather than latency restricted that may be a good trade off," Gartner analyst Tony Harvey tells The Register.

What's more, because each expansion module has its own memory controller, there's really no upper limit to how much DRAM you can add to a system. It doesn't even have to be the same kind of memory. For example, as a cost-saving measure, you could attach a modest amount of DDR5 directly to the CPU and use a slower, albeit cheaper, DDR4 CXL memory-expansion module as part of a tiered-memory hierarchy.
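To make the tiering idea concrete, here is a minimal sketch of one possible placement policy: the most frequently touched pages stay in direct-attached DDR5, and everything else spills to the CXL module. The function name, page names, and access counts are all hypothetical, and real operating systems use far more sophisticated heuristics.

```python
from typing import Dict, List, Tuple


def place_pages(access_counts: Dict[str, int],
                fast_capacity: int) -> Tuple[List[str], List[str]]:
    """Split pages into a fast tier and a slow tier.

    Hypothetical policy: the hottest pages fill the fast tier
    (direct-attached DRAM) and the rest land in the slow tier
    (a CXL memory-expansion module).
    """
    # Sort hottest first; break ties by name so the result is deterministic.
    ranked = sorted(access_counts, key=lambda page: (-access_counts[page], page))
    return ranked[:fast_capacity], ranked[fast_capacity:]


if __name__ == "__main__":
    # Made-up access counts, purely for illustration.
    counts = {"index": 900, "session_cache": 450, "audit_log": 3, "archive": 1}
    fast, slow = place_pages(counts, fast_capacity=2)
    print("DDR5 (direct-attached):", fast)
    print("DDR4 (CXL module):", slow)
```

The point of the sketch is only that a slower tier is tolerable when the data living there is rarely touched.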

These kinds of memory modules are already on the way. Marvell, which detailed its CXL roadmap this spring, is expected to launch its first line of CXL memory modules alongside the Sapphire Rapids and Genoa launch. Likewise, Samsung has a 512GB CXL DRAM module in production awaiting compatible systems to deploy it in.

Really, the only limiting factors are going to be bandwidth (32 gigatransfers/sec per lane, the same as PCIe 5.0) and latency.
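For a rough sense of what that transfer rate means in practice, the back-of-the-envelope sketch below converts it into gigabytes per second for common link widths. It assumes PCIe 5.0's 128b/130b line encoding and ignores CXL protocol overhead, so real-world throughput will be lower. The DDR5-4800 comparison figure is our own assumed example of a single direct-attached memory channel, not a number from any vendor quoted here.

```python
GT_PER_SEC_PER_LANE = 32e9       # PCIe 5.0 signalling rate per lane
ENCODING_EFFICIENCY = 128 / 130  # PCIe 5.0 uses 128b/130b line encoding


def link_bandwidth_gb_s(lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s, before protocol overhead."""
    bits_per_sec = GT_PER_SEC_PER_LANE * lanes * ENCODING_EFFICIENCY
    return bits_per_sec / 8 / 1e9


def ddr5_channel_gb_s(mega_transfers: int = 4800) -> float:
    """Peak bandwidth of one 64-bit DDR5 channel (DDR5-4800 is an assumed example)."""
    bytes_per_transfer = 8  # 64-bit data path
    return mega_transfers * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    for lanes in (4, 8, 16):
        print(f"CXL over PCIe 5.0 x{lanes}: {link_bandwidth_gb_s(lanes):.1f} GB/s per direction")
    print(f"One DDR5-4800 channel: {ddr5_channel_gb_s():.1f} GB/s peak")
```

By this estimate an x16 link tops out at roughly 63GB/s in each direction and an x8 link at about 31.5GB/s, against a 38.4GB/s peak for the assumed DDR5-4800 channel.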

But CXL is about more than adding memory using a PCIe slot. The technology defines a common, cache-coherent interface for connecting any number of CPUs, memory, accelerators, and other peripherals.

Memory at a distance

Things will start to get really interesting when the first CXL 2.0-compatible systems start hitting the market.

The 2.0 spec introduces switching functionality similar to PCIe switching, but because CXL supports direct memory access by the CPU, you'll not only be able to deploy memory at a distance, but also enable multiple systems to take advantage of it in what's called memory pooling.

"CXL 2.0 allows a switch, and not only a switch for fan out, but a switch to allow memory devices to segment themselves into multiple pieces and provide access to different hosts," CXL Consortium President Siamak Tavallaei told The Register.

Imagine deploying a standalone memory appliance packed with terabytes of inexpensive DDR4 that can be accessed by multiple systems simultaneously, much in the same way you might have multiple systems connected to a storage array.

In this arrangement, memory can be allocated to any machine in the rack, and idle resources are no longer locked away out of reach in a standalone server.
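As a toy illustration of pooling, the sketch below models a memory appliance that hands out capacity to named hosts and takes it back when they are done. The class, method, and host names are hypothetical. This is a bookkeeping model only, not a real CXL API, and it says nothing about how a switch actually maps device memory to hosts.

```python
from typing import Dict


class MemoryPool:
    """Toy model of a shared memory appliance; not a real CXL interface."""

    def __init__(self, capacity_gb: int) -> None:
        self.capacity_gb = capacity_gb
        self.allocations: Dict[str, int] = {}  # host name -> GB currently held

    @property
    def free_gb(self) -> int:
        """Capacity not currently assigned to any host."""
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, host: str, size_gb: int) -> bool:
        """Give a host a slice of the pool if enough capacity is free."""
        if size_gb <= 0 or size_gb > self.free_gb:
            return False
        self.allocations[host] = self.allocations.get(host, 0) + size_gb
        return True

    def release(self, host: str) -> int:
        """Return everything a host holds to the pool; report how much was freed."""
        return self.allocations.pop(host, 0)


if __name__ == "__main__":
    pool = MemoryPool(capacity_gb=4096)
    pool.allocate("db-server", 2048)
    pool.allocate("ml-trainer", 1024)
    print("Free after two allocations:", pool.free_gb, "GB")
    pool.release("db-server")
    print("Free after db-server releases:", pool.free_gb, "GB")
```

Capacity released by one host becomes available to any other immediately, which is precisely what memory soldered behind a single CPU cannot offer.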

"That's huge, because previously memory was physically tied to the CPU and you couldn't move it around, and that's causing problems because your core-to-bandwidth ratio is all wrong," Harvey said.

If this sounds too good to be true, just look at any of the boutique composable infrastructure vendors — Liqid and GigaIO spring to mind — which have been doing everything short of this, including making dedicated GPU and NVMe storage appliances.

CXL switches do the same thing but extend this functionality to memory.

"Certainly for the bare metal-as-a-service providers, the cloud providers, the ability to take memory, which is probably one of the most expensive components, and get better utilization out of it is going to be huge," Harvey said.

The disaggregated dream

So far, we've mostly covered how CXL will benefit memory-intensive workloads, and eventually provide greater flexibility for how and by whom that memory can be accessed. However, CXL has implications for other peripherals, like GPUs, DPUs, NICs, and other accelerators.

The third wave of CXL appliances is where things will get really interesting, and the way we think about building systems and datacenters could change dramatically.

Instead of requiring you to buy whole servers, each packed with everything it might need, alongside a couple of CXL memory appliances, the CXL 3.0 spec announced this week will open the door to a truly disaggregated compute architecture where memory, storage, networking, and accelerators can be pooled and addressed dynamically by multiple hosts and accelerators.

This is made possible by stitching together multiple CXL switches into a fabric. The idea here is really no different than interconnecting a bunch of network switches so that clients on one side of the network can efficiently talk to systems on the other. But instead of TCP and UDP over Ethernet, we're talking CXL running over PCIe.

"That creates a much larger ensemble of systems that you might start calling a fabric," Tavallaei said.

Getting to this point wasn't easy, however. The switching functionality necessary to achieve this was only hammered out in the latest release. Previously, the 2.0 spec only allowed for a single accelerator to be attached to any given CXL switch, Tavallaei explained.

The 3.0 spec also provides means for direct peer-to-peer communications over that switch or even across the fabric. This means peripherals — say two GPUs or a GPU and memory-expansion module — could theoretically talk to one another without the host CPU's involvement.

This eliminates the CPU as a potential chokepoint, Tavallaei said.

Finally, third-gen CXL systems will gain support for memory sharing, where multiple systems will be able to access the same bits and bytes stored in a common memory pool simultaneously.

And according to Tavallaei, this can be achieved with minimal latency penalty. In the case of memory sharing, he claims the technology can achieve RDMA-like functionality at a fraction of the latency: hundreds of nanoseconds versus a microsecond or two.

The time to think about CXL is now

While this grand vision of disaggregated compute and composable infrastructure is still a few years off, that doesn't mean you shouldn't be thinking about CXL now.

The technology has near-term applicability for users running large, memory-intensive workloads, like databases or AI/ML workloads, where CXL memory modules may offer a cheaper alternative to DDR5.

Backwards compatibility from one generation to the next — just like PCIe — means that decisions made during your next system refresh could influence how your datacenter is architected in the future.

And you probably won't have to wait long. The first CXL-compatible systems were supposed to launch last year. And as we've seen with Samsung's CXL memory modules announced this spring, there are already CXL products waiting for compatible systems to actually show up.

And when they do, customers will be able to deploy CXL-based memory expansion and explore tiered memory architectures right out of the gate.

Customers could, for example, deploy CXL-based memory expansion and tiered memory now and know that those investments will still be relevant when memory pooling arrives with the first CXL 2.0-compatible systems a few years from now. ®



Biting the hand that feeds IT © 1998–2022