
Microsoft snubs Service Fabric as it plots to switch Teams infrastructure to Kubernetes

Plus, a new Detonation Service and other explosive revelations about easing capacity constraints in lockdown


Microsoft's CTO for Azure has opened up on both the company's response to scaling issues with Teams during the COVID-19 pandemic and future plans to switch to "container-based deployments using Azure Kubernetes Service".

The pandemic put pressure on Microsoft's cloud capacity, and chief techie Mark Russinovich describes in a written post and video how demand for services including Teams, Windows Virtual Desktop and Xbox surged as people endured lockdown – no doubt accounting for issues reported in the UK and elsewhere.

Russinovich notes that Teams daily active users expanded from 32 million earlier this year to 75 million in April. Use of the Windows Virtual Desktop service tripled in four weeks – though as a relatively new service, this will have been from a modest base. Xbox gaming saw a 50 per cent multiplayer increase, a 30 per cent increase in daily peak volumes, and a 50 per cent increase in daily new accounts.

Microsoft has put its energy into scaling Azure to meet the demand.

Building new data centres or even provisioning new servers in existing ones takes too long, so Microsoft took a number of other steps to increase capacity. Some of the measures were about gaining flexibility to spread the load, such as deploying critical microservices to more regions, and discovering that "by redeploying some of our microservices to favour a larger number of smaller compute clusters, we were able to avoid some per-cluster scaling considerations."

In addition, Microsoft made optimisations to its distributed caching, switching from text-based JSON (JavaScript Object Notation) to the Protocol Buffers binary format, along with data compression, achieving "a 65 per cent reduction in payload size, 40 per cent reduction in deserialization time, and 20 per cent reduction in serialization time."
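To see why a move from text to binary shrinks payloads, here is a minimal sketch in Python. It is not Microsoft's code and it does not use Protocol Buffers itself: the standard library's struct module stands in for a binary wire format, and the message fields are hypothetical. The point is only that a fixed binary layout sends values without repeating field names, and that compression helps further on repetitive data.

```python
import json
import struct
import zlib

# Hypothetical presence update; the field names are illustrative assumptions.
message = {
    "user_id": 123456789,
    "status": 2,
    "last_seen": 1591000000,
    "typing": False,
}

# Text encoding: every message carries its field names as strings.
json_bytes = json.dumps(message).encode("utf-8")

# Binary stand-in for a schema-based format: both sides agree on the layout,
# so only the values travel. "<QBQ?" = uint64, uint8, uint64, bool.
LAYOUT = "<QBQ?"
binary_bytes = struct.pack(
    LAYOUT,
    message["user_id"],
    message["status"],
    message["last_seen"],
    message["typing"],
)


def decode(buf):
    """Rebuild the message dict from the fixed binary layout."""
    user_id, status, last_seen, typing = struct.unpack(LAYOUT, buf)
    return {
        "user_id": user_id,
        "status": status,
        "last_seen": last_seen,
        "typing": typing,
    }


# Compression pays off most on repetitive payloads, such as a batch of updates.
batch_json = json.dumps([message] * 100).encode("utf-8")
compressed = zlib.compress(batch_json)

print(len(json_bytes), len(binary_bytes))
print(len(batch_json), len(compressed))
```

The actual savings depend entirely on the message shapes involved, so the figures printed here should not be read against the percentages Russinovich quotes.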


There were also some "purposeful degradations," as Russinovich called them: killswitches for non-essential features. Microsoft turned off the Teams typing indicator – little animated dots that tell you someone is typing – and removed a read-receipt animation, saving a remarkable 30 per cent core CPU time during peak load.

Users annoyed by these animations may wonder why they exist, if they are so expensive. Another optimisation was to stop the mobile Teams app from automatically retrieving next week’s calendar, "which reduced request volume by 80 per cent," he said.
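A kill switch of the kind described above is usually little more than a runtime flag checked before doing optional work. The following Python sketch illustrates the idea under stated assumptions: the class, the function and the feature name "typing_indicator" are all hypothetical, and nothing here reflects how Teams implements it.

```python
class KillSwitches:
    """Tracks which non-essential features are currently switched off."""

    def __init__(self, disabled=()):
        self._disabled = set(disabled)

    def disable(self, feature):
        self._disabled.add(feature)

    def enable(self, feature):
        self._disabled.discard(feature)

    def is_enabled(self, feature):
        return feature not in self._disabled


def build_chat_events(new_messages, users_typing, switches):
    """Return the events to push to clients, skipping degraded features."""
    events = [{"type": "message", "text": text} for text in new_messages]
    # Optional work: only generated while the feature is switched on.
    if switches.is_enabled("typing_indicator"):
        events += [{"type": "typing", "user": user} for user in users_typing]
    return events


switches = KillSwitches()
normal = build_chat_events(["hello"], ["alice", "bob"], switches)

# Under load, an operator flips the switch and the optional events stop.
switches.disable("typing_indicator")
degraded = build_chat_events(["hello"], ["alice", "bob"], switches)

print(len(normal), len(degraded))
```

The essential messages still flow in the degraded case; only the cosmetic events are dropped, which is what makes the degradation "purposeful".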

Some Xbox services were moved out of under-pressure regions (like Dublin) to locations such as US East, freeing capacity where it was most needed.

Microsoft Teams architecture

Russinovich discussed the architecture of Teams, which he described as microservice-based, though since it includes monsters like Exchange and SharePoint, the "microservice" concept is getting stretched in some parts of the product. His diagram shows two things at the bottom of the stack: virtual machines and Service Fabric. Service Fabric is Microsoft's home-grown microservice platform and one of the core services in Azure.

According to Russinovich, the company is now planning to transition to containers and Kubernetes, the microservices platform which originated from Google. Irrespective of the merits of Service Fabric versus Kubernetes, the move will, as he notes, "align us with the industry," which has chosen Kubernetes as the de facto standard.

The decision, he added, is also expected to "reduce our operating costs" and "improve our agility".

Microsoft also intends to "minimize the use of REST and favour more efficient binary protocols such as gRPC." Like Kubernetes, gRPC came out of Google. If you consider these Azure moves alongside the shift to using the Chromium browser engine in Edge, there is now a lot of Google-originated technology at Microsoft.

Getting Teams scaling nicely on Kubernetes sounds challenging, but note that Microsoft is also injecting a little extra chaos, "systematically embracing chaos engineering practices to ensure all those mechanisms we put in place to make our system reliable are always fully functional," said Russinovich.

Microsoft's Detonation Service will activate suspicious links and files in a sandboxed cloud VM

Finally, the CTO for Azure introduced the Detonation service: an anti-malware service that works on links, files and attachments, copying them to a sandboxed "Detonation VM" where they are activated (opened or run) in order to inspect the outcome. ®
