Systems


Discord details how it dodged latency with a super-disk made in the cloud

For when a GCP Local SSD is just not quite reliable enough

Tue 16 Aug 2022 // 17:38 UTC

Chat platform Discord delivered a playful slap to Google yesterday with a post describing how the company dealt with "reliability issues" to achieve some impressively low latency.

Discord handles 4 billion messages sent through the platform each day by its millions of users. The company runs a set of NoSQL database clusters (powered by ScyllaDB), and the real-time nature of chat means those databases need to respond to queries as quickly as possible.

"The biggest impact on our database performance is the latency of individual disk operations, how long it takes to read or write data from the physical hardware," said Glen Oakley, a senior software engineer at Discord.

Below a certain query rate, all is good. "Our databases do a great job of handling requests in parallel," said Oakley.

However, at some point you will hit blocking issues, where the database has to wait for an outstanding disk operation to complete before starting another. Things slow down, and users notice. The queries might time out before reaching the top of the queue.
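A rough way to see where that point lies: if only a fixed number of disk operations can be outstanding at once and each takes a fixed time, the sustainable query rate is capped at their ratio, and anything above it piles up in the queue. The sketch below is a deliberately simplified model, not Discord's measurements; the function names, the limit of 8 outstanding operations, the 0.5 ms and 2 ms latencies, and the 6,000 ops/s arrival rate are all assumed for illustration.

```python
def max_ops_per_second(in_flight: int, latency_ms: float) -> float:
    """Upper bound on throughput when at most `in_flight` operations
    can be outstanding and each one takes `latency_ms` to complete."""
    return in_flight * 1000.0 / latency_ms


def backlog_after(seconds: float, arrival_rate: float,
                  in_flight: int, latency_ms: float) -> float:
    """Operations still waiting after `seconds` of steady arrivals.

    Below the ceiling nothing queues; above it the backlog grows linearly.
    """
    surplus = arrival_rate - max_ops_per_second(in_flight, latency_ms)
    return max(0.0, surplus * seconds)


if __name__ == "__main__":
    # Hypothetical figures: 8 outstanding operations, two per-operation latencies.
    for label, latency_ms in (("local", 0.5), ("network", 2.0)):
        ceiling = max_ops_per_second(8, latency_ms)
        waiting = backlog_after(10, 6000, 8, latency_ms)
        print(f"{label}: ceiling {ceiling:.0f} ops/s, "
              f"backlog after 10 s at 6000 ops/s: {waiting:.0f}")
```

Under these assumed numbers the lower-latency disk never builds a queue, while the higher-latency one falls further behind every second, which is the situation in which queries start timing out.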

One might have thought that slinging the Local SSDs on offer from GCP would deal with the problem. Oakley noted that the NVMe-based storage had incredibly low latency, but "in our testing, we ran into enough reliability issues that we didn't feel comfortable with depending on this solution for our critical data storage."

Another option was persistent disks: storage that can be attached or detached when needed, is replicated, and is connected via the network. That network hop means nowhere near as low latency as a directly attached disk.

So what to do? The team wanted to stick with GCP and prioritize low-latency disk reads, but did not want to sacrifice existing uptime guarantees. They also needed to be able to survive a bad sector on an SSD. The solution was to use GCP's Local SSDs for low-latency reads while still writing to the Persistent Disks to take advantage of snapshotting and redundancy via replication.

After faffing around with various caching options in software (Discord runs Ubuntu on its database servers), the team settled on md and a tricked-out RAID configuration. RAID0 (which just splits raw data over disks – lose one, lose 'em all) was selected for the Local SSDs, and a RAID1 (basically a mirror) was set up between the Persistent Disk and the RAID0 array.
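As a hedged illustration of that layout, the sketch below assembles the two `mdadm` invocations such a setup might use. The article names only md, RAID0, and RAID1; the device paths, array names, and the use of `--write-mostly` to steer reads away from the Persistent Disk are assumptions made here, and `build_mdadm_commands` is a hypothetical helper. It only builds argument lists and runs nothing.

```python
from typing import List


def build_mdadm_commands(local_ssds: List[str], persistent_disk: str,
                         stripe_dev: str = "/dev/md0",
                         mirror_dev: str = "/dev/md1") -> List[List[str]]:
    """Return two mdadm invocations as argument lists; nothing is executed."""
    if len(local_ssds) < 2:
        raise ValueError("striping needs at least two local SSDs")

    # RAID0: stripe data across the Local SSDs for low-latency reads.
    stripe = ["mdadm", "--create", stripe_dev, "--level=0",
              f"--raid-devices={len(local_ssds)}", *local_ssds]

    # RAID1: mirror the stripe with the Persistent Disk. Devices listed after
    # --write-mostly are avoided for reads where possible (assumed choice).
    mirror = ["mdadm", "--create", mirror_dev, "--level=1", "--raid-devices=2",
              stripe_dev, "--write-mostly", persistent_disk]

    return [stripe, mirror]


if __name__ == "__main__":
    # Hypothetical device names; real ones depend on the instance.
    for cmd in build_mdadm_commands(["/dev/nvme0n1", "/dev/nvme0n2"], "/dev/sdb"):
        print(" ".join(cmd))
```

With a mirror arranged this way, every write lands on both halves, so losing a Local SSD takes out the stripe but leaves a full copy on the Persistent Disk.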

The result was, more or less, the super-disk success hoped for, although Oakley noted there were some specific edge cases encountered in the cloud environment. "In retrospect," he said, "disk latency should have been an obvious concern early on in our database deployments.

"The world of cloud computing causes so many systems to behave in ways that are nothing like their physical datacenter counterparts."
