Outposts, Local Zone, Wavelength: It's a new era of distributed cloud, says AWS architect

Adrian Cockcroft talks to El Reg about cloud architecture – and why we need more chaos in our systems

re:Invent The advent of Outposts, Local Zone and Wavelength - released at AWS's re:Invent conference in Las Vegas - amounts to a "new platform" that is now distributed rather than centralised, a company veep has told The Reg.

Adrian Cockcroft, cloud architecture strategy veep, has been at AWS "about three years", though he is also well known for leading the charge towards microservices and public cloud back when he was cloud architect for Netflix, a position he held until 2014.

"The most interesting architectural announcement [at re:Invent] is Outposts, Local Zone and Wavelength," he told us, "because this takes a bunch of the architectural assumptions about cloud, that it's a centralising influence, and turns it into a fully distributed thing."

Outposts is a rack of servers managed by AWS but physically on-premises. The customer provides the power and network connection, but everything else is done for them. If there is a fault, such as a server failure, AWS supplies a replacement for the customer to slide in; it is configured automatically. Outposts runs a subset of AWS services, including EC2 (VMs), EBS (block storage), container services, relational databases and analytics. S3 storage is promised for some time in 2020. Outposts was announced at re:Invent 2018, but is only now becoming generally available.

Local Zone, currently only available in Los Angeles, is an extension of an AWS Region running in close proximity to the customers that require it for low latency. In LA, the driving use case is video editing.

Wavelength is a physical deployment of AWS services in data centres operated by telecommunication providers to provide low-latency services over 5G networks. Operators signed up so far include Verizon, Vodafone Business, KDDI and SK Telecom.

It turns out that these three services are closely related. "Outposts is a rack of machines. What we had to figure out is ways that we could let other people host those racks," said Cockcroft. "Local Zone, that's effectively a large clump of outposts. Wavelength is a service provided by Verizon or KDDI which lets you deploy into that, but the way we implement it is, we ship some Outposts to Verizon and they stick them near the 5G endpoints."

Unlike Outposts, Local Zone is multi-tenant. "We have to get a group of customers for us to invest in a Local Zone, there has to be enough local demand," said Cockcroft. It is unlikely that London would get one because there is already an AWS region there. Perth, on the other hand, could make a good case. "Perth is a very long way from Sydney. And we support Australia. There are mining companies who want the cloud but it's too far away. There's a lot of interest in countries where there is just one region, to create disaster recovery regions or backup regions."

For Cockcroft, this is a new architecture. "What we've done is taken a bunch of assumptions about the architecture of the cloud, that have been true for 10-15 years, and said no, that's not true any more. Now the machines can be separated over the network, we can have them deployed anywhere. People can start thinking, what is a cloud-native architecture in this new distributed world?"

Adrian Cockcroft, VP Cloud Architecture Strategy, at AWS re:Invent


Listening to Cockcroft, you would almost imagine that the ability to run on-premises is some new thing. The actual new thing is to be able to hand over management of your on-premises computing to AWS and manage it as if it were just another deployment zone.

The details of a recommended distributed architecture built on Outposts are still emerging. "During 2020 we'll have more to talk about it," said Cockcroft.

How has best practice for architecting a resilient, scalable application changed since the days Cockcroft ran this at Netflix? "The network layer sophistication is probably the biggest change," he said. "The way that you arrange the networking traffic and segment things and the security models. It's not good enough to just have a disaster recovery site, you've also got to be secure here and secure there, and have your security architecture so that if you get a failure in your primary it doesn't kill the security architecture, they have to be independent but they also have to trust each other. There's a whole lot of interesting problems to solve around the identity and key management. There's a coordination problem of knowing what is working here and what is working there."

The complexity is such that Cockcroft believes that creating AWS Solutions, templates for best-practice deployments, will help. This was the case, he said, with data lakes. "Every customer is building data lakes. They were all doing it different ways. Most of them weren't building in role-based access control and security as a baseline thing, so we came up with Lake Formation, which is the generic 'everyone should build a data lake using this'," he said.

What about chaos engineering, the practice of inserting deliberate failures into systems in order to verify resilience? Netflix was an early advocate of this – is AWS providing tooling? "We do a bit already," Cockcroft told us. "It's piecemeal, every individual service has a different thing. Aurora has an ability to go into it and tell the database to misbehave. You can cause a master to fail, you can introduce latency, you can introduce error rates in the database. There's an interface for creating failure scenarios.

"As you look across the product line, we have to talk to every single team and say, what do you need to do to expose a few hooks where we can introduce some deliberate failures? But quite often you can do it at application level.
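Cockcroft's point about doing fault injection "at application level" can be sketched without any AWS-specific hooks: a thin wrapper that adds latency or raises errors at a configurable rate in front of any callable. This is an illustrative sketch only - the class and parameter names are invented here, not part of any AWS or Netflix tooling.

```python
import random
import time


class ChaosInjector:
    """Wrap a callable and inject deliberate failures at a configurable
    rate - an application-level stand-in for the per-service hooks
    Cockcroft describes. All names here are illustrative."""

    def __init__(self, error_rate=0.0, latency_s=0.0, seed=None):
        self.error_rate = error_rate   # probability of raising an injected error
        self.latency_s = latency_s     # extra delay added to every call
        self._rng = random.Random(seed)  # seeded for reproducible experiments

    def __call__(self, func):
        def wrapped(*args, **kwargs):
            if self.latency_s:
                time.sleep(self.latency_s)              # injected latency
            if self._rng.random() < self.error_rate:
                raise RuntimeError("chaos: injected failure")
            return func(*args, **kwargs)
        return wrapped


# Example: a lookup that fails roughly half the time under chaos,
# letting callers verify their retry/fallback paths actually work.
@ChaosInjector(error_rate=0.5, seed=42)
def lookup(key):
    return {"region": "us-west-2"}.get(key)

failures = 0
for _ in range(100):
    try:
        lookup("region")
    except RuntimeError:
        failures += 1
```

The value of the experiment is in what surrounds the wrapper: if the calling code has no timeout, retry, or fallback, the injected failures surface that gap long before a real outage does.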

"The driver for this is that we have an increasing number of customers in safety and business-critical industries moving all-in to cloud. If that's a healthcare provider or an airline, or a bank or a financial institution, this has to work. We have to understand what happens and all the different possible failure modes."

According to Cockcroft, current disaster recovery plans are in many cases inadequate. Businesses "have backup data centres that they daren't failover to because they know it wouldn't work. That's the common practice. Everyone looks embarrassed if you ask too many questions about how they test their disaster recovery."

Region-to-region recovery on AWS is a better solution, he claimed, because the target looks the same as the source, whereas "every data centre is different, so every data centre failover is custom built and very poorly tested".

AWS for on-premises. AWS for cloud. AWS for edge. AWS for disaster recovery. This is the world of "all-in" and it does require putting huge trust in one provider. That is one issue to think about, but what is less controversial is that if an organisation has made that decision, it pays to do it right, and in this respect there is still a lot to learn. ®
