CenturyLink L3 outage knocks out web giants and 3.5% of all internet traffic
Cloudflare fingers intertwined BGP and Flowspec SNAFUs
Internet backbone operator CenturyLink has experienced an outage that degraded performance of major web companies around the world.
CenturyLink acknowledged the outage and posted basic information about the incident.
"Our technicians are working to resolve an IP outage. Ensuring the reliability of our services is our top priority. We will provide regular updates on our progress." - CenturyLink (@CenturyLink), August 30, 2020
"We are working hard to fix an IP outage and have begun to see restoration in several areas. We’ve pulled in every resource available to resolve the outage as soon as we are able and will continue to provide additional updates as they are available." - CenturyLink (@CenturyLink), August 30, 2020
"We are able to confirm that all services impacted by today’s IP outage have been restored. We understand how important these services are to our customers, and we sincerely apologize for the impact this outage caused." - CenturyLink (@CenturyLink), August 30, 2020
Cloudflare suggested that Border Gateway Protocol (BGP) trouble was the cause. "Starting at 10:04 UTC, there were a significant number of BGP updates," Cloudflare's post said. "Under normal conditions, the Internet sees about 1.5MBs-2MBs of BGP updates every 15 minutes. At the start of the incident, the number of BGP updates spiked to more than 26MBs of BGP updates per 15 minute period and stayed elevated throughout the incident."
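For a sense of scale, the figures Cloudflare quoted can be turned into a rough multiplier. The sketch below is back-of-the-envelope arithmetic only: the constant and function names are our own, the inputs are the numbers from the quote above, and because Cloudflare said "more than 26MBs" the results are lower bounds.

```python
# Figures quoted from Cloudflare's post: MB of BGP updates per 15-minute window.
NORMAL_LOW_MB = 1.5    # quiet end of the normal range
NORMAL_HIGH_MB = 2.0   # busy end of the normal range
INCIDENT_MB = 26.0     # "more than 26MBs", so treat this as a lower bound


def spike_factor(observed_mb: float, baseline_mb: float) -> float:
    """Return how many times larger the observed volume is than the baseline."""
    return observed_mb / baseline_mb


# Compare the incident against both ends of the normal range.
low = spike_factor(INCIDENT_MB, NORMAL_HIGH_MB)
high = spike_factor(INCIDENT_MB, NORMAL_LOW_MB)

print(f"BGP update volume rose at least {low:.0f}x to {high:.1f}x over normal")
```

On those numbers, BGP update volume ran at somewhere between 13 and a little over 17 times its usual level.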
Cloudflare speculated that a bad Flowspec rule was the source of the incident and may have prevented successful BGP propagation, which in turn just meant more BGP traffic.
Whatever the cause, the incident meant that Cloudflare was briefly not at its best, and neither were some large customers including Discord, Twitter, Xbox Live, PlayStation Network, Amazon and many others.
The Register cannot, however, find outage reports from many of the impacted sites, suggesting that alternative routes to L3 may have been slow picking up the slack. Companies reliant on CenturyLink alone fared rather worse, but at least did so at a slowish time of the week. ®