Amazon’s Away Teams laid bare: How AWS's hivemind of engineers develop and maintain their internal tech

Cloud giant's structure, staff practices revealed

The principles of Amazon service-oriented collaboration

Here’s how Amazon’s service-oriented collaboration works based on our research:

  1. Team structure
    • Each of the groups that owns a service has a set of goals and possibly a P&L that represents success. A roadmap is generally in place to meet those goals.
    • The teams are ostensibly autonomous and can make any important decision needed to meet their goals.
    • The "value to the customer" is part of the mission for each team. This is codified using content such as mock press releases to ensure developers keep end-user needs in mind.
    • As much as possible, teams are kept small, adhering to the two-pizza rule, meaning about six people.
    • Services can be refactored or new services can be spun out to new teams. Teams that don’t work are shut down and the technology they created is distributed to other teams or discarded.
    • New teams often are created to solve urgent, end-to-end problems.
  2. Development process
    • Teams use a shared set of development tools for source code and managing the development pipeline, some offered as shared services. There are many tools and services that are commonly or universally used, but no hard requirements. Every team can do what makes sense to get the job done fast. While this is true, at some point you may have to show with data why you deviated.
    • The DevOps model is fully embraced. Each team performs operational support for its service.
    • Access to most source code is not hard to get. One group can usually quite easily take a look at the source code of another without prior restraint. There are some exceptions.
    • A/B testing and detailed monitoring are widespread and used for almost every aspect of the site and infrastructure. The testing is based on the WebLab service, supported by a team that trains staff on how to make testing statistically significant.
    • Teams do not generally have to worry about the rates of internal use of resources. There is no internal currency changing hands for tracking such usage. Rates of usage internally across services are allocated as part of the budget process and monitored by finance teams who meet periodically with teams to discuss any unusual growth in services and encourage optimization.
    • Decreasing technical debt is not considered a good reason to do anything unless it has an impact on reaching the goals of the team.
  3. Collaboration practices
    • Changes to one team’s service may be implemented by another team that needs the enhanced capability, acting as what is called an Away Team. The Away Team works on the Home Team’s code to add what it needs according to established engineering standards, and then leaves that code in good order to be maintained by the Home Team that owns the service, with help when needed.
    • When an Away Team is not an option because the requestor doesn’t have the ability to implement improvements to the service, the request leads to a management discussion about how to optimize the big-picture roadmap. Usually roadmaps are bursting, so accommodating a new request means reshuffling the existing roadmap.
    • If extending a service using an Away Team doesn’t work out for some reason, it is perfectly fine to duplicate and create whatever you need to accelerate your progress. There is no concern about duplication across the platform as long as you have a need that will help you move forward.
    • A team creating a service is given credit when they do something that has a positive downstream impact on other services. Management recognizes contributions to the big picture, usually on the P&L of the higher entity.
    • "Bar raisers" are Amazon staff, often from other teams, who act as independent experts and approve key decisions. They are used not only for hiring, for which they are widely known, but also for high-impact decisions on design, customer experience, architecture, and A/B testing. It is possible to go against the recommendation of a bar raiser, but such a move is noted and made visible to higher levels of management.
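
To make the point about statistically significant A/B testing concrete, here is a minimal sketch of one textbook check, a two-proportion z-test on conversion counts. It is not a description of how WebLab works, which our sources did not detail; the function name, the sample figures, and the 0.05 threshold are all illustrative assumptions.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates.

    conv_a / n_a: conversions and visitors in the control group (hypothetical inputs).
    conv_b / n_b: conversions and visitors in the treatment group.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one visitor")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data, the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 5.0% conversion in control against 5.5% in treatment.
z, p = two_proportion_z_test(conv_a=1000, n_a=20000, conv_b=1100, n_b=20000)
is_significant = p < 0.05  # 0.05 is a common convention, not an Amazon rule
```

A small p-value only says the difference is unlikely to be chance. Whether the change is worth shipping remains a judgment for the team that owns the service.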

These principles operate somewhat differently based on the part of Amazon that is using them.

The oldest, original set of technology that morphed into services is generally called legacy. MAWS is an internal platform of services that are not public. The public form of AWS is the latest. There may be others we have not heard about.

For example, older products such as Kindle may use services from all three of these layers. Newer products like Alexa and Echo tend to use more of the public services on AWS.

There have been many generations of evolution from legacy to MAWS to AWS and also with respect to development tools. All of these changes happen in waves that may take years to complete.

The teams outside AWS proper are less likely to have a P&L at the service or team level. In general, AWS teams are known for having the most methodological purity, a state in which the service, team, and P&L have the same boundary.

Keep in mind this picture was assembled from talking to many people with different perspectives at different levels of the organization. It would be wonderful to make it sharper. But finding someone who knows the whole picture and detailed history is not easy. Amazon PR staff take note: we are always ready to sit down with Werner Vogels, Amazon CTO, and go over the details.

How Kurzweil and Von Hippel explain the power of service-oriented collaboration

Amazon’s model encourages direct team-to-team, service-to-service collaboration, providing principles for collaboration so that as much progress as possible can take place based on each team optimizing the services it needs directly.

As your correspondent came to understand Amazon’s model, it became clear that the structure of service-oriented collaboration uses levers for acceleration that have been documented by two celebrated researchers who have studied how technology development can be optimized.

MIT professor Eric Von Hippel’s research into user-driven innovation shows that when users are given direct access to the means for creating a solution, tremendous innovation can result. Transferring the "sticky information" held by the user, which otherwise must be rendered into requirements documents or passed from user to builder, is difficult and never complete. When this step doesn’t have to take place because user and builder are the same person or same team, the outcome is much better. Amazon’s Away Team model embraces this concept and allows teams to create building blocks that have an ideal fit to purpose.


Ray Kurzweil’s analysis of the exponential pace of technology development provides another lens through which the power of Amazon’s model can be explained. Your correspondent has summarised Kurzweil’s model in Research Mission on Technology Leverage; in brief, his thesis is as follows:

  • At first, progress in any area of technology seems slow because basic services are being developed.
  • But then, more complex services are built out of the simpler ones, and so on, accelerating the pace of development.
  • At the same time, funding goes to improving services that are most impactful.
  • As the services are used more, the fit to purpose improves.

Kurzweil’s research shows that this pattern has held throughout history in many different areas of technology. My view is that Amazon is still in the early stages of this exponential curve, which is being driven by use of services both inside and outside of Amazon.

Amazon’s model wouldn’t work without data from usage driving funding and optimization. End-to-end teams and Away Teams play a crucial role in identifying new services and improving the fit of existing services.


Right now, AWS is focused on creating general-purpose, higher-level services that all fit into a generic platform for software development. The highest-level services are being created on top of the platform by Amazon itself (Alexa, Kindle, etc) and by AWS customers who are building all sorts of products and IT infrastructure.

Biting the hand that feeds IT © 1998–2022