GitHub's journey towards microservices and more: 'We actually have our own version of Ruby that we maintain'
The Reg talks to Software Engineering veep Sha Ma
Interview GitHub has described efforts to break down its monolithic application architecture into microservices – and revealed that it still runs some services on AWS, even after the 2018 acquisition by Microsoft.
Sha Ma, VP of Software Engineering at GitHub, spoke on the subject at the November QCon Plus virtual developer event and spent some time with us afterwards.
The online code shack is among the world's busiest sites, used by over 50 million developers and hosting over 100 million repositories.
GitHub was first built in 2008 using the Ruby on Rails web application framework. "GitHub's architecture is deeply rooted in Ruby on Rails," said Ma, adding that "a monolithic architecture got us pretty far," including multiple code deploys every day and high scale, "serving over a billion API calls daily."
The scale of the site undercuts the claim that Ruby on Rails, or a monolithic architecture, cannot scale. Why then is GitHub now migrating?
The decision is quite recent, Ma told us. "It really started at the beginning of this year. We acquired so many companies, like we've acquired Semmle which is based in Oxford, their primary stuff is in C and Python. And we've acquired Dependabot, NPM which is package management in JavaScript, and Pull Panda. Internally, we've also merged a few of the sister teams that were within Microsoft, so a lot of folks from Azure DevOps are now part of our team, and they are used to working in anything from C# to TypeScript.
"All that diversity which joined GitHub, which used to be just a Ruby on Rails shop, prompted us to think: how do we enable developers that have brought diversity in tech stack and skill set to be productive working together? That made us realise that the monolith as a sole development option is no longer viable."
Does that mean GitHub is migrating away from Ruby?
"Our strategy is not a complete replacement," said Ma. "The founders of GitHub were very deeply rooted in the Ruby community. They were contributors." GitHub has also hired leading Ruby developers over the years. "We actually have our own version of Ruby that we maintain," she told us.
"When things work well with GitHub then we contribute back into the Ruby open source... There are things we've done with the Ruby code base that are highly custom to make GitHub as performant as possible. We know we're never going to get away from that completely, and a lot of people are still very productive in that code base. For us it's going to be a hybrid environment for the foreseeable future."
On occasion, performance-critical code is written in other languages. "When we extracted authorization, we ended up rewriting that service outside the monolith, in Go," she said.
What about MySQL?
While MySQL has performed well overall, it has also been identified as an issue in some of the company's outage reports. Has GitHub considered migrating to a different database manager?
"Similar to Ruby, we have world-renowned DBAs [database administrators] that have scaled a lot of systems. For the foreseeable future we're going to remain with MySQL just because we have a lot of expertise there," Ma said.
Further, the process of splitting out functional groups into microservices will also enable some breaking apart of the data. "Even as a monolith, we've been able to scale," Ma told us. "We feel that our MySQL solution still has quite a bit of runway for us, both from a performance perspective and in terms of how much we can store."
How has the Microsoft acquisition influenced GitHub's infrastructure? Before the acquisition, GitHub was largely hosted in its own data centres. Is that moving to Azure?
"We're exploring things potentially to move," Ma said. "We actually still have things hosted on AWS. For example, a lot of our data analytics is on AWS and we've started a project to look at migration into Azure, especially since we get internal pricing which is more favourable for us. But a large part, I would say 80-90 per cent of our stuff is hosted in data centres that we physically maintain."
Migration steps and pitfalls
At QCon, Ma explained some of the work the company is doing to enable its migration. "Good architecture starts with modularity," she said. "The first step towards breaking up a monolith is to think about the separation of code and data based on feature functionality. This can be done within the monolith before physically separating them in a microservices environment."
She also described some of the pitfalls. "I've seen a lot of cases where people start by pulling out the code logic, but still rely on calls into a shared database inside the monolith. This often leads to a distributed monolith which ends up being the worst of both worlds, having to manage the complexities of microservices without any of the benefits."
Ma explained that "It's important to keep in mind that dependency direction should always go from inside of the monolith to outside of the monolith and not the other way around."
"Getting data separation right is a cornerstone in migrating from a monolithic architecture," Ma said. "For example, we grouped everything related to repositories together, everything related to users together, and everything related to projects together... creating functional groups of database schemas will eventually help us safely split the data onto different servers and clusters needed for a microservices architecture."
This process means fixing database queries that cross these domain boundaries. "At GitHub we implemented a query watcher in the monolith to alert us any time a query crosses functional domains. We would then rewrite these queries into multiple queries that respect the domain boundaries and perform any necessary joins at the application layer," said Ma.
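To make the idea concrete, here is a minimal Python sketch of what such a check and rewrite could look like. It is not GitHub's implementation, which lives inside its Ruby monolith and was not described in detail. The table names, the domain mapping, and the functions `crosses_domains` and `repos_with_owner_names` are all hypothetical, and the regex-based table detection is a simplification that only recognises plain FROM and JOIN clauses.

```python
import re
import sqlite3

# Hypothetical mapping of tables to functional domains (not GitHub's real schema).
TABLE_DOMAINS = {
    "repositories": "repos",
    "users": "users",
    "projects": "projects",
}

# Simplified: captures the identifier that follows FROM or JOIN.
# Subqueries, quoted identifiers and comma-separated table lists are not handled.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def domains_in_query(sql):
    """Return the set of functional domains whose tables the query references."""
    tables = (name.lower() for name in TABLE_PATTERN.findall(sql))
    return {TABLE_DOMAINS[name] for name in tables if name in TABLE_DOMAINS}


def crosses_domains(sql):
    """True when a single query touches tables from more than one domain."""
    return len(domains_in_query(sql)) > 1


def repos_with_owner_names(conn):
    """Replace a repositories/users SQL join with two single-domain queries."""
    repos = conn.execute(
        "SELECT name, owner_id FROM repositories ORDER BY name"
    ).fetchall()
    owner_ids = sorted({owner_id for _, owner_id in repos})
    if not owner_ids:
        return []
    placeholders = ",".join("?" for _ in owner_ids)
    users = conn.execute(
        f"SELECT id, login FROM users WHERE id IN ({placeholders})", owner_ids
    ).fetchall()
    # The join happens here, in application code, not in the database.
    login_by_id = dict(users)
    return [(name, login_by_id.get(owner_id)) for name, owner_id in repos]


def make_demo_db():
    """Build a tiny in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT);
        CREATE TABLE repositories (id INTEGER PRIMARY KEY, name TEXT, owner_id INTEGER);
        INSERT INTO users VALUES (1, 'alice');
        INSERT INTO users VALUES (2, 'bob');
        INSERT INTO repositories VALUES (1, 'rails-app', 1);
        INSERT INTO repositories VALUES (2, 'go-auth', 2);
        """
    )
    return conn


if __name__ == "__main__":
    joined = (
        "SELECT r.name, u.login FROM repositories r "
        "JOIN users u ON u.id = r.owner_id"
    )
    if crosses_domains(joined):
        print("cross-domain query:", sorted(domains_in_query(joined)))
    print(repos_with_owner_names(make_demo_db()))
```

The trade-off in a rewrite like this is an extra round trip to the database in exchange for queries that each stay inside one functional group, which is what would allow the two tables to live on separate servers later.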
How far along is GitHub in its migration towards cloud native? "Not very far," Ma told us. "We're in the very early stages, I think because there is a lot of knowledge accumulated over the years including how to fine-tune Ruby and how to fine-tune MySQL to make the site as performant as it is today. Even if we do explore cloud native solutions, it will probably be newer services that are not core to GitHub itself, like Actions, or even Projects as it's getting rebuilt."
Does GitHub use Kubernetes to orchestrate containers? "We're very much using Kubernetes. In order to support multi-language variations and new services that are being created, a year and a half ago we started templatizing a lot of things that are common across multiple teams, so we have what we call microservices in a box, that has Kubernetes templates. We know that every service needs logging so we automatically log into Splunk. We know every service will need to be deployed so there is automatic deployment into our existing deployment process. So people can get up and running quickly on the operational side of things."
During the migration period, does GitHub add code to the monolith at the same time as writing new microservices? "Yes, absolutely," said Ma. "Because our strategy is enablement and not replacement, the code in the monolith needs to be maintained and improved and we're still doing that." The idea though is that when a microservice is ready, it should be used 100 per cent in place of existing code so "you don't have to maintain multiple versions inside and outside the monolith."
Extracting authentication is a priority, because of its role in letting developers continue to work if the website is down. "If GitHub is down, people can't actually perform any Git operations, and that's problematic," Ma told us. "People should still be able to work through command-line interfaces. That's why we're making it a priority to extract authentication as a core service outside the monolith, for enablement, so now people can use more of our systems when the web front end is not available."
Is there a target for when GitHub will be able to say it has a microservices architecture? "I would say years," Ma told us. "This shift is not just an architecture decision. It is also a cultural shift … I think eventually the gravitational pull will shift towards all the new services being built as microservices, and that a lot of the existing services will have been rebuilt and refactored out of the monolith, but for the foreseeable future we will still be operating at least a set of core services that will be part of the monolith."
It is a pragmatic approach. "Microservices is not your solution to technical debt and bad architecture," Ma told us. "I think there's been a trend of people who went down the microservices path and are now going back into monolith because microservices became too unwieldy for them. Microservices doesn't replace good architecture. Going through things like, what should be grouped together? How should we look for things that cross domain boundaries? How should we set up teams and on-call? pushed us towards better architectural practices that benefit us both in the monolithic and microservice world. A lot of the preparatory work we're doing, we're actually doing in the monolith before extracting it." ®