How does Monzo keep 1,600 microservices spinning? Go, clean code, and a strong team

Well-known software development principles count for more than technology choices

QCon London: Software engineers from digital bank Monzo told developers at the QCon event in London how and why the company runs its banking systems on 1,600 microservices.

Monzo's session at QCon was in stark contrast to Monday's presentation in which Sam Newman warned that a microservices architecture is a "last resort". Senior engineers Matt Heath and Suhail Patel described how microservices work well for the bank, founded in 2015 and now with over 4 million customers.

As a new business that hoped to grow substantially, Monzo had a requirement for a technology platform that was extensible, scalable, resilient and secure. The idea was to start with a few basic banking services, and then be able to add more as time and resources allowed.

Monzo was convinced early on of the value of a distributed system. The bank did not want a single, big resilient system with a complex failover that you hope never has to run.

"If you don't exercise those failover modes, how can you know that they work reliably?" said Heath. The team started with Mesos for cluster management, but by 2016 had switched to Kubernetes (K8s) as the "emerging market leader".

One of the goals was to abstract the complexity of infrastructure. "We think all the complexities about scaling of infrastructure, making sure that services are provisioned and databases are available, should be dealt with by a specific team, so that engineers can focus on the product," said Patel. The systems run on Amazon Web Services (AWS).

Matt Heath and Suhail Patel present at QCon London

K8s has not been completely pain-free. In 2017, Monzo "had quite a large outage, because a problem with K8s and how it interacted with etcd and linkerd, due to a combination of different bugs that were quite hard to test," said Heath.

Monzo picked Cassandra as the database because it scales horizontally (meaning you can simply add more hardware to scale, rather than having to migrate to a bigger system).

On the coding side: "We use Go as our primary programming language," said Heath. "It's quite simple, it's statically typed, and it makes it easy for us to get people on board." Go has a backwards-compatibility guarantee, meaning that when a new version appears, you can simply recompile existing code, getting the benefits of updates to features such as the garbage collector. Go is also well suited to strict policies about error handling.

"We have static analysis to make sure that you are not papering over errors," said Patel.

Banking systems are well suited to a modular approach. There is a requirement to link to many different systems such as BACS, CHAPS, Visa, Mastercard, Apple Pay and Google Pay. "Adding those things as separate systems allows us to keep them simpler," said Heath. Monzo builds integrations in-house as much as possible, rather than using third-party implementations, to get more control over resilience and performance (and probably to save money in the long term as well). The team even built its own chat system, used internally and for support.

Monzo has also built its own tools for interacting with AWS and K8s, such as one called Shipper, which can deploy or roll back an individual service. Shipper can deploy directly from a pull request, which represents an update to code maintained in a Git repository.

Each Monzo microservice runs in a Docker container. "One of our biggest decisions was our approach to writing microservices," said Patel. There is a shared core library, which is available in every service; this is essentially copied in every container, though the build process will strip out unused code. This means that "engineers are not rewriting core abstractions like marshalling of data". It also enables metrics for every service so that after deployment it immediately shows up in a dashboard with analysis of CPU usage, network calls and so on. Automated alerting will identify degraded services.
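The idea of a shared library that handles marshalling and metrics for every service can be sketched roughly as follows. This is an assumption-laden illustration rather than Monzo's library: `Metrics`, `Wrap`, and the balance service are all hypothetical names, JSON stands in for whatever wire format the bank uses, and an in-memory counter stands in for a real metrics backend.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// Metrics is a hypothetical in-memory counter standing in for a metrics client.
type Metrics struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewMetrics() *Metrics { return &Metrics{counts: make(map[string]int)} }

// Inc adds one to the named counter.
func (m *Metrics) Inc(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[name]++
}

// Count returns the current value of the named counter.
func (m *Metrics) Count(name string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[name]
}

// Wrap is the "shared library" part: it decodes the request, calls the
// typed handler, encodes the response, and records metrics, so that the
// service author writes only the handler.
func Wrap[Req, Resp any](m *Metrics, name string, h func(Req) (Resp, error)) func([]byte) ([]byte, error) {
	return func(raw []byte) ([]byte, error) {
		m.Inc(name + ".requests")
		var req Req
		if err := json.Unmarshal(raw, &req); err != nil {
			m.Inc(name + ".errors")
			return nil, fmt.Errorf("%s: decode request: %w", name, err)
		}
		resp, err := h(req)
		if err != nil {
			m.Inc(name + ".errors")
			return nil, fmt.Errorf("%s: %w", name, err)
		}
		out, err := json.Marshal(resp)
		if err != nil {
			m.Inc(name + ".errors")
			return nil, fmt.Errorf("%s: encode response: %w", name, err)
		}
		return out, nil
	}
}

// The service-specific part: plain types and one function.
type BalanceRequest struct {
	AccountID string `json:"account_id"`
}

type BalanceResponse struct {
	Pence int64 `json:"pence"`
}

func getBalance(req BalanceRequest) (BalanceResponse, error) {
	if req.AccountID == "" {
		return BalanceResponse{}, errors.New("missing account_id")
	}
	return BalanceResponse{Pence: 1250}, nil // fixed value for the sketch
}

func main() {
	metrics := NewMetrics()
	handler := Wrap(metrics, "balance", getBalance)

	out, err := handler([]byte(`{"account_id":"acc_1"}`))
	fmt.Println(string(out), err)
	fmt.Println("requests:", metrics.Count("balance.requests"))
}
```

The design point is that the wrapper, not the service author, decides how requests are decoded and what gets counted, so a new service is instrumented from its first deployment.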

Monzo has an extensive shared library available within every microservice

A lot of thought goes into the interface or API that each service exposes. The team favours writing many small services, each dedicated to a single purpose, rather than fewer more complex services. "Why do we have such granularity? We want to minimise the risk of change," said Patel. "For example, if we want to change the way contactless payments work, we're not affecting the chip and PIN system."

How do developers work on their code, given that running 1,600 microservices is not going to work on a laptop? "You are running a subset," said Heath. "We have an RPC filter that can detect you are trying to send a request to a downstream that isn't currently running, it can compile it, start it, and then send the request to it."
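The behaviour Heath describes, starting a downstream service on demand the first time something calls it, might look something like the sketch below. Everything here is hypothetical: `LazyRouter`, `Starter` and the service name are invented for illustration, and the "compile and start" step is simulated by a function that returns an in-process handler rather than building and launching a real binary.

```go
package main

import (
	"fmt"
	"sync"
)

// Service is a stand-in for a running downstream: request in, response out.
type Service func(request string) (string, error)

// Starter builds and starts a service by name. In a real developer setup
// this would compile and launch a process; here it is simulated.
type Starter func(name string) (Service, error)

// LazyRouter forwards calls to running services and starts missing ones.
type LazyRouter struct {
	mu      sync.Mutex
	running map[string]Service
	start   Starter
}

func NewLazyRouter(start Starter) *LazyRouter {
	return &LazyRouter{running: make(map[string]Service), start: start}
}

// Call sends the request to the named service, starting it first if it is
// not already running. The lock is held while starting so that two
// concurrent callers do not start the same service twice.
func (r *LazyRouter) Call(name, request string) (string, error) {
	r.mu.Lock()
	svc, ok := r.running[name]
	if !ok {
		started, err := r.start(name)
		if err != nil {
			r.mu.Unlock()
			return "", fmt.Errorf("start %s: %w", name, err)
		}
		r.running[name] = started
		svc = started
	}
	r.mu.Unlock()
	return svc(request)
}

func main() {
	router := NewLazyRouter(func(name string) (Service, error) {
		fmt.Println("starting", name)
		return func(request string) (string, error) {
			return name + " handled " + request, nil
		}, nil
	})

	// The first call triggers a start; the second reuses the running service.
	fmt.Println(router.Call("service.ledger", "req-1"))
	fmt.Println(router.Call("service.ledger", "req-2"))
}
```

With this shape a developer only pays the start-up cost for the services their request path actually touches, which is the "running a subset" effect described above.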

Why do microservices work for Monzo, whereas in some cases they add complexity without delivering much benefit? The Register has attended many QCon events, and while software development trends have changed from year to year, some things have remained consistent. One is that the way developers interact with each other in a team (and with management) counts for more than whatever development methodology they espouse. Another is that an incremental approach wins over occasional large changes. "An iterative process is generally what we take to heart at Monzo," said Heath, "both from an infrastructure perspective but also from a product perspective. By making small changes frequently we make sure we are going in the right direction."

Another recurrent QCon theme is the advantage of simplicity over complexity. Taken as a whole, what Monzo's system does is highly complex, but the bank has designed its systems in such a way as to divide that complexity into smaller, simpler pieces, and abstract it away from developers working on the code. "You don't need to know how 1,600 services work," said Heath. The hard task of managing K8s is delegated to specialists.

Monzo has also standardised "a small set of technology choices", said Heath, so that "as a group, we can collectively improve those tools". This could be frustrating for developers who have different technology preferences, but must help substantially with collaboration since everyone learns the same set of tools. "Code needs to be readable to other humans," said Patel. "We optimise code for readability. One of our engineering principles is not to optimise [performance] unless it is a bottleneck."

What Monzo presented at QCon seemed to be a strong template for software development and deployment in the case where you have a complex system with many components, and need to be able to respond quickly when requirements change or features are added.

Heath and Patel made a great case for the value of microservices. Note, though, that Monzo uses a lot of custom, in-house tools and libraries that are not easy to replicate. Further, many of the principles they presented – like writing clean, readable and disciplined code, focusing on a few carefully chosen pieces of technology, and taking an incremental approach – are winners in any software architecture. ®
