I got 502 problems, and Cloudflare sure is one: Outage interrupts your El Reg-reading pleasure for almost half an hour

A chunk of the internet vanished today. Lucky it's not used for anything important, right?

Cloudflare, the outfit noted for the slogan "helping build a better internet", had another wobble today as "network performance issues" rendered websites around the globe inaccessible.

The US tech biz updated its status page at 1352 UTC to indicate that it was aware of issues, but things began tottering quite a bit earlier. Cloudflare handles content delivery, DNS and DDoS protection for a good portion of the world's websites, El Reg among them, so when it sneezes, a chunk of the internet has to go and have a bit of a lie down. That meant netizens were unable to access many top sites globally.

A stumble last week was attributed to the antics of Verizon by CTO John Graham-Cumming. As for today's shenanigans? We contacted the company, but it has yet to give us an explanation.

While Cloudflare implemented a fix by 1415 UTC and declared things resolved by 1457 UTC, a good portion of internet users noticed things had gone very south for many, many sites.

The company's CEO took to Twitter to proffer an explanation for why things had fallen over, fingering a colossal spike in CPU usage as the cause while gently nudging aside the wilder conspiracy theories about a DDoS attack.

However, the outage was a salutary reminder of the fragility of the internet as even Firefox fans found their beloved browser unable to resolve URLs.

Ever keen to share in the ups and downs of life, Cloudflare's own site also reported the dread 502 error.

As with the last incident, users who endured the less-than-an-hour of disconnection would do well to remember that the internet is a brittle thing. And Cloudflare would do well to remember that its customers will be pondering if maybe they depend on its services just a little too much.

Updated to add at 1702 BST

Following publication of this article, Cloudflare released a blog post stating the "CPU spike was caused by a bad software deploy that was rolled back. Once rolled back the service returned to normal operation and all domains using Cloudflare returned to normal traffic levels."

Naturally, it then added...

"We are incredibly sorry that this incident occurred. Internal teams are meeting as I write performing a full post-mortem to understand how this occurred and how we prevent this from ever occurring again." ®

Biting the hand that feeds IT © 1998–2022