FCC boss orders probe into 'unacceptable' T-Mobile US outage after carrier plays dog-ate-my-homework card

Yup, the old 'leased fiber line broke' excuse

T-Mobile US is attempting to pin the blame for a massive network outage on Monday on a third-party leased fiber network, though the head of America's communications watchdog has demanded a full investigation into the "unacceptable" blunder.

The mobile telco, now one of just three giants in the US mobile market after its merger with Sprint was approved, suffered a multi-hour breakdown that caused so much disruption that some feared the US had been hit by a massive distributed denial-of-service attack. Incoming calls and texts to T-Mobile US subscribers were dropped, and data services degraded.

In fact, according to the network's president of technology Neville Ray, “a leased fiber circuit failure from a third party provider in the southeast” was to blame. In a blog post published late Tuesday, Ray said the mobile carrier did have a redundancy system in place but it failed, causing an overload that reverberated across the whole network.

“We’ve worked with our vendors to build redundancy and resiliency to make sure that these types of circuit failures don’t affect customers. This redundancy failed us and resulted in an overload situation that was then compounded by other factors,” he wrote. “This overload resulted in an IP traffic storm that spread from the Southeast to create significant capacity issues.”

That explanation may not be enough for Ajit Pai, chairman of the Federal Communications Commission (FCC), however. On Monday night, Pai tweeted: “The T-Mobile network outage is unacceptable. The FCC is launching an investigation. We're demanding answers – and so are American consumers.”

Capacity issues

That same evening, T-Mobile US CEO Mike Sievert posted a brief explanation online, saying the communications breakdown was “an IP traffic related issue that has created significant capacity issues in the network core throughout the day.” Both Sievert and Ray stressed that many services were still working fine, and that it wasn’t their fault.

It’s not clear if that will stick if the FCC does carry out a full probe as Pai has demanded; we have asked the FCC what its plans are. National mobile operators do use a multitude of networks, not just their own, but there is assumed to be a massive amount of resiliency and redundancy built into those systems given their critical role in everyday communications for millions of people.

One telecoms policy expert, Harold Feld of Public Knowledge, was skeptical of T-Mob's explanation, tweeting: "How the Hell is it possible that a huge chunk of our telecom infrastructure went down because of a single circuit (and back up) failure?"

Any investigation will dig into why T-Mobile US's backup systems failed. It will also look at whether the carrier's claims are true. There have been other FCC probes into previous outages, with some fines handed out.

In December 2018, a network meltdown within CenturyLink broke broadband and VoIP connectivity for more than a day, affecting 22 million subscribers in 39 states, and caused some 12 million calls to be black-holed or degraded. 911 emergency calls were also affected. The ISP was not fined.

Having said that, the FCC scrutinized separate outages within the 911 call system, and fined AT&T $5.25m in 2018 and CenturyLink $400,000 in 2019 for dropping some emergency calls.

If T-Mobile US is found to have been responsible for Monday's screw-up, it may face a fine, whereas if its backup systems failed, it may face a requirement to improve them albeit with no fine.

According to informed speculation on the part of Cloudflare CEO Matthew Prince, the issue may in fact have been caused by T-Mobile US engineers who were “making some changes to their network configurations” that “went badly” and resulted in a “series of cascading failures for their users, impacting both their voice and data networks.” ®

Biting the hand that feeds IT © 1998–2022