There is a path to replace TCP in the datacenter
Forty years in, a protocol that's over the hill and under the gun, at least for the majors
One of the most entrenched standards of the last forty years, the Transmission Control Protocol (TCP), might be seeing the end of the line, at least for applications in some of the world's largest datacenters.
For the rest of the world though, the hassle factor of a shift might be too heavy to bear, even if 100X faster message delivery capabilities are within reach.
But what's good for the hyperscalers can be a win for mid-sized IT. Eventually, anyway.
Four decades ago, TCP, with its focus on networks with maybe one thousand geographically distributed nodes, often hundreds of miles apart, was truly bleeding edge. It could do the then-critical job of streaming big chunks of data over long distances and even today remains the default basis for almost every web-based technology.
The datacenter of today is, of course, wildly different. Now, we're dealing with hundreds of machines in close proximity, communicating at short time intervals. TCP was designed for a world of millisecond packet delivery from one end of the network to another, but in a datacenter this job is done in a microsecond.
"The problem with TCP is that it doesn't let us take advantage of the power of datacenter networks, the kind that make it possible to send really short messages back and forth between machines at these fine time scales," John Ousterhout, Professor of Computer Science at Stanford, told The Register. "With TCP you can't do that, the protocol was designed in so many ways that make it hard to do that."
It's not like the realization of TCP's limitations is anything new. There has been progress to bust through some of the biggest problems, including in congestion control to solve the problem of machines sending to the same target at the same time, causing a backup through the network. But these are incremental tweaks to something that is inherently not suitable, especially for the largest datacenter applications (think Google and others).
"Every design decision in TCP is wrong for the datacenter and the problem is, there's no one thing you can do to make it better, it has to change in almost every way, including the API, the very interface people use to send and receive data. It all has to change," he opined.
Of course, that's all far easier said than done. "Entrenched" doesn't begin to describe TCP. Nearly all software depends on it and in very specific ways, no less.
But Ousterhout is one of those folks in systems research who can look at an intractable problem like this and see a path forward, no rose-colored glasses necessary.
His current Stanford tenure is focused on distributed systems and software, but if his name sounds familiar it's because he created technologies meant to displace things that no longer fit the times. One example is the high-level Tcl (Tool Command Language) scripting language, which he created over three decades ago.
This led him to a career at Sun to further build that effort, then into his own Tcl support and tooling company, Scriptics. The theme running throughout his patents and research has consistently been pulling legacy tech out by the roots and replacing it with something easier and more tuned to modern systems.
His answer to the TCP time-trap is called "Homa" and he already has an implementation of it for the Linux kernel that he says is production-ready. The challenge is how to switch applications over so they can use his new interface. The grander, more distant issue is that there are millions of applications dependent on TCP.
The starting point is among the hyperscalers where this kind of fix is going to be most welcome. Most of the large-scale datacenter applications running at Google and Amazon or Azure tend to never program directly to the TCP socket interface, choosing instead to use libraries that implement remote procedure calls, where a program sends a short message to some other machine to ask it to do a task then gets a short response back.
The largest datacenter folks have frameworks that make it easier to issue those remote procedure calls (RPCs), and these are often internal tools like Google's gRPC. In Ousterhout's view, if a company like Google modified its frameworks to support Homa alongside gRPC, the applications that use them should require only a one-line change.
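To make the "one-line change" idea concrete, here is a minimal Python sketch of an RPC layer that hides the transport from the application. Everything in it is hypothetical and for illustration only: `StreamTransport`, `MessageTransport`, `RpcClient` and `handle` are invented names, both transports simply call the handler in-process, and none of it reflects the actual Homa or gRPC APIs.

```python
def handle(request: bytes) -> bytes:
    """Stand-in for the remote machine: run the requested task, send a short reply."""
    method, _, arg = request.partition(b":")
    if method == b"upper":
        return arg.upper()
    return b"error: unknown method"


class StreamTransport:
    """Hypothetical placeholder for a TCP-style, stream-based transport."""
    name = "stream"

    def round_trip(self, request: bytes) -> bytes:
        # A real transport would write to a connection; here we call in-process.
        return handle(request)


class MessageTransport:
    """Hypothetical placeholder for a message-based transport."""
    name = "message"

    def round_trip(self, request: bytes) -> bytes:
        # Same in-process shortcut; only the class the application picks differs.
        return handle(request)


class RpcClient:
    """Toy RPC framework: applications call methods and never touch the transport."""

    def __init__(self, transport):
        self.transport = transport

    def call(self, method: str, arg: str) -> str:
        request = f"{method}:{arg}".encode()
        return self.transport.round_trip(request).decode()


# The only line an application would need to change to switch transports:
client = RpcClient(StreamTransport())  # or RpcClient(MessageTransport())
print(client.call("upper", "hello"))
```

Because the application only ever calls `client.call(...)`, swapping the object passed to `RpcClient` is the whole migration in this toy setup. Real frameworks would have far more to handle, but the shape of the argument is the same.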
"That's the best hope for making the transition away from TCP," he tells us. "If we do that, many of the most interesting datacenter applications can take advantage of the new protocol." He adds that older applications based around TCP would keep working well but for the largest datacenter applications, the shift to Homa plus their own customized RPC tooling could mean up to 100x faster message deliver—a big deal at large scale.
There's an exhaustive list of everything that's wrong with TCP for the modern datacenter, along with some context on what it takes to start making the shift, if only conceptually. ®