Network congestion algorithms have design flaw, says MIT

In the battle between bandwidth, latency, and starvation, we'll always be choosing two

Trying to create a network that's fair, equitable, and starvation-free may simply be impossible – at least with current congestion control algorithms (CCA), an MIT study has found.

Published ahead of a conference presentation this week, the MIT paper [PDF] found that, regardless of approach, CCAs like Google's BBR, FAST, and others all suffer from the physical limitations of networks, leading to an unavoidable problem where some users are starved of bandwidth. 

"Our theorem shows that CCAs have to choose at most two out of three properties: high throughput, convergence to a small and bounded delay range, and no starvation," the researchers said in the paper. 

The paper cites a number of non-congestive network issues, like ACK aggregation and end-host scheduling, which serve to upset tight algorithmic controls that rely on estimates to account for happenings on a network it can't control.

In an ideal situation, the researchers wrote, CCAs operating on a single network are designed to converge and work together to create the smallest possible delay range. According to the researchers, here's where the problem lies. 

"Because most CCAs attempt to work across many orders of magnitude of rates, they must map a large rate range into a small delay range. Thus, even small changes in estimated queuing delay would induce enormous changes," the team wrote. 

In other words, algorithms account for a lot, but they simply can't factor physical imperfections in the real world or non-congestion delays into their calculations.

Can we build better CCAs?

The paper admits its conclusions are "pessimal for delay-bounding CCAs," and raises the question of "whether we are doomed to choose between bounding delays and avoiding starvation."

Speaking to IEEE Spectrum, lead study author and MIT computer scientist Venkat Arun said that his team's discovery sheds new light on CCA problems previously attributed to poor algorithmic decision-making and insufficient network capacity. 

Instead, Arun and his team's research shows that the algorithms themselves simply aren't designed to account for network jitter, which the paper uses to refer to non-congestive causes of network delays. "We do not believe it is possible to circumvent this problem with algorithms that map loss rates (or delays) to sending rates," the team wrote. 

This doesn't mean the MIT team doesn't have suggestions on how to solve this seemingly unavoidable network management problem. In the paper, the team makes several suggestions that basically amount to increasing algorithmic queue times to account for jitter.

Still, even that might not be enough, the team concluded. "It is also possible that purely end-to-end CCAs might always suffer from the issues we found, and in-network support such as active queue management, explicit congestion signaling, or stronger isolation is required." ®

Similar topics


Other stories you might like

Biting the hand that feeds IT © 1998–2022