Xero, Slack suffer outages just as Let's Encrypt root cert expiry downs other websites, services
'The internet is a complex system'
Updated
Websites and apps are suffering or have suffered outages around the world for at least some netizens today due to connectivity issues.
Though the exact causes of the IT breakdowns are in many cases not fully known right now, there has been a sudden uptick in downtime right as Let's Encrypt, which provides free HTTPS certificates to a ton of organizations, let one of its root and intermediate certs expire.
This expiration should be invisible to software, services, and users relying on the certificates for encryption, tamper-proof communications and whatnot. However, not all systems appear to have handled the expiry well. Thus, it is assumed the expiration and at least some of today's outages are interlinked. Other downtime, such as Slack's teetering, is not tied to Let's Encrypt.
Specifically, today, September 30, an older root certificate – DST Root CA X3 – which was used to underpin HTTPS certs issued by Let's Encrypt, expired as planned along with its R3 intermediate. There's more technical info on that here.
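For anyone wanting to sanity-check an expiry time like this against a clock, here is a minimal Python sketch. The `notAfter` string is our understanding of what DST Root CA X3 carries, so treat it as an assumption and read the real value from the certificate itself; `is_expired` is a hypothetical helper name, not part of any library.

```python
import ssl
from datetime import datetime, timezone

# Assumption: the notAfter value of DST Root CA X3, written in the
# format OpenSSL prints. Read it from the actual certificate to be sure.
DST_ROOT_X3_NOT_AFTER = "Sep 30 14:01:15 2021 GMT"


def is_expired(not_after: str, now: datetime) -> bool:
    """Return True if `now` is at or past the certificate's notAfter time."""
    # ssl.cert_time_to_seconds parses the OpenSSL-style date string
    # into seconds since the Unix epoch (UTC).
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    expiry = datetime.fromtimestamp(expiry_seconds, tz=timezone.utc)
    return now >= expiry


# Two sample instants on September 30, 2021, either side of the expiry.
before = datetime(2021, 9, 30, 13, 0, tzinfo=timezone.utc)
after = datetime(2021, 9, 30, 15, 0, tzinfo=timezone.utc)

print("13:00 UTC expired?", is_expired(DST_ROOT_X3_NOT_AFTER, before))
print("15:00 UTC expired?", is_expired(DST_ROOT_X3_NOT_AFTER, after))
```

Passing `now` in explicitly, rather than reading the system clock inside the function, keeps the check easy to test against any date you care about.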
Apps and websites should instead now use, or continue to use, Let's Encrypt's ISRG Root X1 certificate and its replacement R3 intermediate automatically, and this whole process should pretty much be transparent to users. Let's Encrypt warned "older devices" will need a prod to catch up with and trust the latest foundational certs.
However, it appears it's more than "older devices" that have tripped up and stopped working as expected – such as refusing to establish secure connections to remote systems – and that may be due to an issue with the now-expired R3 intermediate, or because your client doesn't recognize the ISRG root cert.
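To make those two failure modes concrete, here is a toy Python model of chain checking. It is a sketch under loud assumptions, not real X.509 validation: there are no signatures, the `Cert` class and `check_chain` function are hypothetical, and every date except the September 30 expiry day is invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

UTC = timezone.utc


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_after: datetime


def check_chain(served, trust_store, now):
    """Toy check of a served chain (leaf first). Returns (ok, reason)."""
    # 1. Nothing the server sends may be past its expiry.
    for cert in served:
        if now >= cert.not_after:
            return False, f"{cert.subject} has expired"
    # 2. Each certificate must name the next one as its issuer.
    for child, parent in zip(served, served[1:]):
        if child.issuer != parent.subject:
            return False, f"{child.subject} is not issued by {parent.subject}"
    # 3. The last certificate's issuer must be a root the client trusts.
    root = trust_store.get(served[-1].issuer)
    if root is None:
        return False, f"{served[-1].issuer} is not in the trust store"
    if now >= root.not_after:
        return False, f"trusted root {root.subject} has expired"
    return True, "ok"


NOW = datetime(2021, 10, 1, tzinfo=UTC)  # the day after the expiry

# Illustrative certificates; the dates are assumptions, not real values.
isrg_root = Cert("ISRG Root X1", "ISRG Root X1", datetime(2030, 1, 1, tzinfo=UTC))
dst_root = Cert("DST Root CA X3", "DST Root CA X3", datetime(2021, 9, 30, 14, 0, tzinfo=UTC))
new_r3 = Cert("R3", "ISRG Root X1", datetime(2025, 1, 1, tzinfo=UTC))
old_r3 = Cert("R3", "DST Root CA X3", datetime(2021, 9, 30, 14, 0, tzinfo=UTC))
leaf = Cert("example.test", "R3", datetime(2021, 12, 1, tzinfo=UTC))

modern_store = {isrg_root.subject: isrg_root}  # client that knows ISRG Root X1
legacy_store = {dst_root.subject: dst_root}    # client that only knows the old root

print(check_chain([leaf, new_r3], modern_store, NOW))  # healthy setup
print(check_chain([leaf, old_r3], modern_store, NOW))  # server still serving the old R3
print(check_chain([leaf, new_r3], legacy_store, NOW))  # client lacks the ISRG root
```

The second case corresponds to a server that never swapped out its intermediate, and the third to a client whose trust store predates ISRG Root X1. Real TLS libraries build and verify paths in more involved ways, which is partly why behaviour differed between clients.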
If you're having trouble with Git and servers using broken Let's Encrypt certs, for instance, there's a thread on fixing that here.
- Let's Encrypt completes huge upgrade, can now rip and replace 200 million security certs in 'worst case scenario'
- Let's Encrypt warns about a third of Android devices will from next year stumble over sites that use its certs
- Lettuce Encrypt, Encrypt We Must: Hobby projects change name after Let's Encrypt fires off trademark complaints
HTTPS cert expert Scott Helme has been tracking the certificate expiry and reporting on Twitter what's falling over, likely as a result of the cert changes, right here.
These include Catchpoint System Services; OVH; OpenBSD; Facebook for developers; Fortinet; parts of Netify; and Cisco Umbrella.
Heroku also suffered a wobble with its metrics and Connect Dashboard, though this was unrelated to the cert death, which was a separate issue for its customers. Shopify and parts of Cloudflare also encountered mysterious downtime that appears to be unrelated to Let's Encrypt.
"There are also many reports of iOS and macOS versions newer than expected seeing issues on sites serving the expired R3 intermediate," Helme said. "I've seen errors on iOS 11, 13 and 14 along with several macOS version only a few minor releases behind current."
A spokesperson for the Let's Encrypt organization told us earlier today: "We are monitoring the expiration and providing advice when we can. The internet is a complex system and any change has knock-on effects. We are not currently seeing more than what we might describe as 'expected' issues for any change like this."
Meanwhile, we've spotted that invoicing software giant Xero has been down for hours right as people try to get their submissions in by the end of the month, and Slack has had connectivity issues. It's not clear why Xero fell over; we've asked for an explanation.
For Slack, we're told it was a DNSSEC issue rather than a certificate problem. Essentially, as described here, Slack made a change to its DNS records and then backed out the update. That adjustment is still propagating through the internet and, as a result, looking up the site's servers may fail for some people.
"This issue was caused by our own change and not related to any third-party DNS software and services," the chat app biz noted. "In order to resolve this faster, your ISP will need to flush their DNS record for slack.com."
"Less than one per cent of users" were affected, it claimed. We note that Slack has 12 million or more daily active users, so you do the math.
We'll keep you posted on any updates. If your app or site isn't working, consult its support desk, mailing lists, or forums for any upgrades to apply or commands to run, or wait for code to automatically pick up today's changes. There's a long help thread here on the Let's Encrypt community board. ®
Updated to add
A spokesperson for Xero told us regarding today's outage: "We have identified the underlying issue that has been causing disruption to customers accessing Xero.
"This was related to a platform certificate issue at approximately 2pm UTC which we have now resolved. We have implemented a fix and our service is recovering."
Let's Encrypt's older root cert expired at, funnily enough, 2pm UTC today when Xero started breaking down. We asked if there was a connection between the outage and the expiry of DST Root CA X3.
"This issue was caused when a certificate our systems required expired and some of our subsystems did not automatically trust the new certificate," the spokesperson said, adding: "It is not related to any third-party provider."