Everything you wanted to know about modern network congestion control but were perhaps too afraid to ask

In which a little unfairness can be quite beneficial


Systems Approach It’s hard not to be amazed by the amount of active research on congestion control over the past 30-plus years. From theory to practice, and with more than its fair share of flame wars, the question of how to manage congestion in the network is a technical challenge that resists an optimal solution while offering countless options for incremental improvement.

This seems like a good time to take stock of where we are, and ask ourselves what might happen next.

Congestion control is fundamentally an issue of resource allocation — trying to meet the competing demands that applications have for resources (in a network, these are primarily link bandwidth and router buffers), which ultimately reduces to deciding when to say no and to whom. The best framing of the problem I know traces back to a paper [PDF] by Frank Kelly in 1997, when he characterized congestion control as “a distributed algorithm to share network resources among competing sources, where the goal is to choose source rate so as to maximize aggregate source utility subject to capacity constraints.”
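In the standard notation of Kelly's paper, with $x_s$ the sending rate of source $s$, $U_s$ its utility function, and $C_l$ the capacity of link $l$, the problem reads:

```latex
\max_{x \ge 0} \; \sum_{s} U_s(x_s)
\quad \text{subject to} \quad
\sum_{s \,:\, l \in \mathrm{route}(s)} x_s \;\le\; C_l
\quad \text{for every link } l
```

That is, maximize the sum of the sources' utilities, subject to the constraint that the flows crossing any given link cannot exceed its capacity. Different congestion control algorithms correspond, roughly, to different choices of $U_s$ and different distributed strategies for approximating the optimum.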

This is a hard problem in any computer system, but what makes it especially difficult in this case is the decentralized nature of the Internet. Not only is the Internet’s approach to congestion control distributed across its millions of hosts and routers, it is fair to think of them as cooperatively trying to achieve a globally optimal solution. From this perspective, there is a shared objective function, and all the elements are implementing a distributed algorithm to optimize that function. Compounding the challenge, and arguably the reason there are so many approaches to congestion control, is the lack of consensus on the right objective function.

Of course, everyone wants high throughput, low latency, stability, and fairness, but it's how you trade those off against one another that makes the question so hard to answer. To make matters worse, the problem is over-constrained, meaning that each solution must choose which constraints to weight most heavily and which to sacrifice.

Fairness has been a particularly thorny issue, not only when considering a given algorithm competing against itself, but especially when comparing algorithm A with algorithm B. If A achieves higher throughput than B, but does so by being more aggressive and hence stealing bandwidth from B's flows, then A's improvement is not fairly gained and may be discounted. In such an analysis, Jain's fairness index [PDF] has historically been used as a quantitative measure of fairness.
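Jain's index is simple to compute: for $n$ flows it is the square of the total throughput divided by $n$ times the sum of squared throughputs, yielding 1.0 for a perfectly equal allocation and $1/n$ when one flow takes everything. A minimal implementation:

```python
def jain_index(throughputs):
    """Jain's fairness index for a list of per-flow throughputs.

    Returns 1.0 when all flows receive equal throughput, and
    approaches 1/n as the allocation concentrates on one flow.
    """
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))
```

For example, four flows each getting 10 Mbps score 1.0, while one flow taking all 40 Mbps scores 0.25, the worst case for four flows.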


Fairness arguments have been used for 30 years in favor of the incumbent algorithm (whatever it happens to be), making Ranysha Ware's recent proposal [PDF] to measure harm instead of fairness a welcome breath of fresh air in this ongoing debate. Ware et al. advocate for a threshold based on harm, as measured by a reduction in throughput or an increase in latency. Intuitively, if the amount of harm caused by flows using a new mechanism B on flows using existing mechanism A is within a bound derived from how much harm A-managed flows cause other A-managed flows, we can consider B deployable alongside A without harm.
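The intuition behind the threshold can be sketched in a few lines. This is a simplified reading of Ware's proposal, not her exact methodology, and the function names are my own: harm is measured as the fractional drop in a flow's throughput relative to its performance without the competitor, and B passes if it harms A's flows no more than A harms its own.

```python
def harm(baseline, observed):
    """Fractional reduction in throughput relative to the flow's
    baseline performance without the competing traffic present.
    (For latency, one would measure the fractional increase instead.)
    """
    return max(0.0, (baseline - observed) / baseline)

def deployable(baseline, vs_incumbent, vs_newcomer):
    """B is considered deployable alongside A if the harm B's flows
    inflict on A is no worse than the harm A's flows inflict on
    other A flows -- the bound derived from A competing with itself.
    """
    return harm(baseline, vs_newcomer) <= harm(baseline, vs_incumbent)
```

So if an A flow gets 100 Mbps alone, 70 Mbps against another A flow, and 80 Mbps against a B flow, B clears the bar: it does less damage than A does to itself.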

Unfortunately, replacing fairness with harm doesn’t eliminate the over-constrained nature of the problem. But what might is the proliferation of purpose-built mechanisms targeted at specific usage domains. Based on my reading of a lot of the recent research, this is where congestion control algorithms seem to be headed.

Data Center TCP (DCTCP) is a great example. It targets congestion within the data center, and so assumes link bandwidths are 10Gbps or higher and RTTs are typically measured in the tens of microseconds.

It doesn't have to respond to as wide a range of operating conditions as the Internet as a whole. Even more importantly, because the data center is contained within a single administrative domain, it is possible to enable features like Explicit Congestion Notification (ECN) on every single switch, and optimize the algorithm to take advantage of that information.
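DCTCP's use of ECN is worth spelling out, since it illustrates what a purpose-built algorithm buys you. The sender maintains a running estimate α of the fraction of its packets that were ECN-marked, and cuts its window in proportion to α rather than halving it on every congestion signal. A minimal sketch of that control law (constants and names simplified from the DCTCP paper):

```python
G = 1 / 16  # EWMA gain; the DCTCP paper recommends g = 1/16

def update_alpha(alpha, ecn_marked, acked):
    """Per-RTT update of the estimated fraction of marked packets:
    alpha <- (1 - g) * alpha + g * F, where F is the fraction of
    packets marked in the most recent window of ACKed data."""
    frac = ecn_marked / acked
    return (1 - G) * alpha + G * frac

def reduce_cwnd(cwnd, alpha):
    """Cut the congestion window in proportion to the extent of
    congestion: cwnd <- cwnd * (1 - alpha/2). Only when every packet
    was marked (alpha = 1) does this match TCP's traditional 50% cut."""
    return cwnd * (1 - alpha / 2)
```

The payoff is that mild congestion triggers only a mild window reduction, which keeps switch queues short without sacrificing throughput, but it only works because every switch on the path can be trusted to mark packets.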

You could argue that the data center is a special case, but I’m not so sure. Google’s Bottleneck Bandwidth and RTT (BBR) mechanism is another example worth considering.

It is general-purpose in that it attempts to handle the wide range of scenarios that any Internet-worthy algorithm would have to respond to, but it’s not particularly fair when competing with non-BBR flows. But that doesn’t matter if it’s only deployed within Google’s backbone, interconnecting their data centers. In that case, BBR only has to be fair to its own flows. Is it possible to generalize from this example?
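As an aside, what makes BBR different is that it paces to an explicit model of the path rather than reacting to packet loss: it tracks the bottleneck bandwidth as a windowed maximum of delivery-rate samples and the propagation delay as a windowed minimum of RTT samples. A heavily simplified sketch of that model (the names are mine; real BBR uses time-based filters, pacing gains, and a state machine on top of this):

```python
from collections import deque

class BBRModel:
    """Simplified sketch of BBR's path model: bottleneck bandwidth
    is estimated as the max of recent delivery-rate samples, and
    propagation delay as the min of recent RTT samples."""

    def __init__(self, window=10):
        self.bw_samples = deque(maxlen=window)   # bytes/sec samples
        self.rtt_samples = deque(maxlen=window)  # seconds

    def on_ack(self, delivery_rate, rtt):
        """Record one delivery-rate and RTT sample from an ACK."""
        self.bw_samples.append(delivery_rate)
        self.rtt_samples.append(rtt)

    @property
    def bdp(self):
        """Bandwidth-delay product: the volume of data the pipe holds.
        BBR paces near the bandwidth estimate and caps data in flight
        near a small multiple of this value."""
        return max(self.bw_samples) * min(self.rtt_samples)
```

Because the model ignores loss entirely, BBR keeps sending at the estimated bottleneck rate while loss-based flows back off, which is precisely why it can be unfair to them.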

Perhaps it is. In the early decades of the Internet, end-to-end meant between any two hosts in the Internet. But today, direct communication between arbitrary hosts is the exception, not the rule. Most TCP connections on today’s Internet can be categorized as follows: edge-to-cloud (between an end user device and the nearest cloud hosting center or CDN server); cloud-to-cloud (typically over a provider’s backbone); or intracloud (between two servers within a data center).

The latter two are usage domains that already run purpose-built TCP congestion control algorithms, as covered above. And the first is well on its way to being reinvented — with the rollout of 5G (and all the spectrum-allocation algorithms 5G implies) for mobile devices, and with cloud-provider last-mile offerings for our streaming and gaming devices at home.

None of this means the resource allocation problem goes away, but it might become more tractable as the deployments become more customized. Most importantly, we won't have to agree on one universal mechanism, except as a fallback for those rare occasions when true peer-to-peer transfers are required. This may very well accelerate the rate at which congestion control papers are published, since the "it's not fair" argument is no longer relevant.

Of course, if the Internet truly re-decentralizes in the coming years, congestion control might once again need to address global optimization problems, but perhaps that will happen without a single incumbent algorithm being favored by default. It seems congestion control researchers have a secure future. ®

Larry Peterson and Bruce Davie are the authors of Computer Networks: A Systems Approach and the related Systems Approach series of books. All their content is open source and available on GitHub. You can find them on Twitter, their writings on Substack, and past The Register columns here.


