Storage

This article is more than 1 year old

Tuesday's AWS S3-izure exposes Amazon-sized internet bottleneck

Exposes Bezos' customers' inadequate BC/DR plans, too

Wed 1 Mar 2017 // 10:30 UTC

Analysis Amazon’s S3 outage is a gift to Azure and Google, on-premises IT, hybrid cloud supporters and multi-cloud gateways. But it has also exposed inadequate business continuance and disaster recovery provisions by Amazon's business customers.

All of them can point the finger at Jeff Bezos and say AWS let users down. And now we know how important it us that users shouldn’t put all their trust in it. They should have an alternative or hybrid cloud strategy. Hundreds of marketing flacks are writing their post-outage prescriptions along these lines right now.

S3 (Simple Storage Service) is Amazon’s object storage facility in its public cloud. The S3 outage took place yesterday, February 28 at 9.44am PT with storage bucket access problems, due to high error rates, in its US-East-1 region (North Virginia), a highly popular data centre. For many users their data was inaccessible and services curtailed over a five-hour breakdown period. Even Nest videocams and smartphone apps were affected.

For many S3 app developers, having data in two regions for redundancy to protect against such outages was an expensive step they didn’t take – ouch.

As well as S3 there were also problems with – get this – Amazon Appstream 2.0, Athena, CloudSearch, Cognito, ECR (Docker container registry), EMR), Amazon Elastic Transcoder, Elasticsearch Service, Glacier, Inspector, Kinesis Firehose, Lightsail, Mobile Analytics, PinPoint, Redshift, Simple Email Service, SWF, WorkDocs, WorkMail, Auto Scaling, and AWS Batch, CloudFormation, CodeBuild, CodeCommit, CodeDeploy, Data Pipeline, Elastic Breanstalk, Key Management, Lambda, OpsWork Stacks, and Storage Gateway, in the North Virginia part of AWS’s infrastructure.

Most are resolved but some are still current. The situation was overwhelmingly complex and this AWS EC2 status history pop-up for EC2 (N. Virginia) - US-East-1 - viewed today indicates some of this:

EC2 status history pop-up from AWS for N. Virginia.

Amazon is not explaining how and why this great collection of incidents happened:

AWS status update

What should the tech giant be doing?

For Amazon, it seems clear that its US-East-1 region needs splitting into further smaller failure domains, beyond the secondary US-East-2 region in Ohio. It also needs to separate out its online public dashboard infrastructure, so that it survives a US-East-1 r other regional database failure.

For alternative suppliers this is an obvious marketing gift. Egnyte CEO and co-founder Vineet Jain was quick off the mark with a comment ready for inquiring hacks:

The Internet and the cloud are not perfect. While many folks like to think we are becoming less susceptible to outages, they are still a fact of life and cannot be taken lightly - as evidenced by Amazon today. Whether you are a small business whose ability to complete transactions is halted, or you are a large enterprise whose international operations are disrupted, if you rely solely on the cloud it can do significant damage to your business.
While the outage has shown the massive footprint that AWS has, it also showed how badly they need a hybrid piece to their solution. Hybrid remains the best pragmatic approach for businesses who are working in the cloud, protecting them from downtime, money lost, and a number of other problems caused by outages like today.

The public cloud in general is not borked. Putting your IT operational trust in one data centre in one public cloud supplier, however big that supplier is, has been shown to be risky. The message is simple: don’t be a cheapskate. Pay the extra cost of data centre duplication for your, and your customers' peace of mind.

At the end of the day, so to speak, every supplier affected by this outage had an inadequate business continuity and disaster recovery plan. Yes, señor supplier, Amazon let you down, but you let your own customers down too. ®

Topics

Special Features

Vendor Voice

Resources

Storage

Tuesday's AWS S3-izure exposes Amazon-sized internet bottleneck

Exposes Bezos' customers' inadequate BC/DR plans, too

What should the tech giant be doing?

More about

More about

Narrower topics

Broader topics

More about

More about

More about

Narrower topics

Broader topics

TIP US OFF

Other stories you might like

AWS must pay $525M to cloud storage patent holder, says jury

Snowmobile, Amazon's truck-powered migration service, reaches the end of the road

US-EAST-1 region is not the cloudy crock it's made out to be, claims AWS EC2 boss

Protecting distributed branch office environments from ransomware

Irish power crunch could be prompting AWS to ration compute resources

GenAI will be bigger than the cloud or the internet, Amazon CEO hopes

UK govt office admits ability to negotiate billions in cloud spending curbed by vendor lock-in

AWS severs connection with several hundred staff

Amazon to lure upstarts with $500K in AWS AI credits each

Microsoft hiring Inflection team triggers interest from EU's antitrust chief

Stability AI reportedly ran out of cash to pay its bills for rented cloudy GPUs

Amazon finishes pumping $4B into AI darling Anthropic

About Us

Our Websites

Your Privacy