Poor Meta. Technical debt and user training made its exabyte-scale data migration tricky

Welcome to the real world, kids. And for the rest of us, a future in which Meta is – gulp! – better at large-scale analytics

Fri 27 Jan 2023 // 06:28 UTC

Here’s one from the “welcome to the real world, kids, we have no sympathy for your plight” files: social media giant Meta’s engineering team has bemoaned the complexity of migrating from legacy technology.

In a Thursday post detailing migration of exabyte-scale data stores to new schemas, a quartet of Meta software engineers offered the following insight into their work.

Migrations are hard. Moreover, they become much harder at Meta because of:

Technical debt: Systems have been built over years and have various levels of dependencies and deep integrations with other systems.
Nontechnical (soft) aspects: Walking users through the migration process with minimum friction is a fine art that needs to be honed over time and is unique to every migration.

Fellas, we’re going to let you in on a secret: everyone gets technical debt, and everyone has trouble educating users about new systems.

Meta is not special. It is not a beautiful and unique snowflake. It is the same sort of decaying collection of cobbled-together tech that every other organisation accrues over time.

In this case, the decrepit tech was “numerous heterogeneous services, such as warehouse data storage and various real-time systems, that make up Meta’s data platform — all exchanging large amounts of data among themselves as they communicate via service APIs.”

As Meta detailed in December 2022, those systems struggled to scale as the data-harvesting giant built more AI workloads that needed to access data from diverse sources.

Improved data logging and serialization was the answer, so that data could describe itself more effectively and therefore be more easily ingested by diverse applications.

Meta built a system called “Tulip” to sort that out, and was chuffed that the formats it uses require 40 percent to 85 percent fewer bytes and 50 percent to 90 percent fewer CPU cycles.

As Meta’s Thursday post explains, Tulip may have been top tech but making it work was hard, not least because the social media giant employed over 30,000 logging schemas.

Across the four-year effort to adopt Tulip, Meta engineers found that some data could not be easily ingested or converted, or that doing so was computationally expensive. Some tools designed to ease migration created problems as they ran, so engineers added rate limiters to stop issues snowballing.
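
For readers wondering what a rate limiter on a migration tool might look like, here is a minimal token-bucket sketch in Python. It is not Meta's code: the post as reported here does not say how its limiters work, and the names `TokenBucket`, `migrate_batch` and `migrate_one` are invented for illustration. The idea is simply that a migration job may only convert as many schemas as the bucket allows, and anything over the limit waits for a later pass.

```python
import time


class TokenBucket:
    """Allow roughly `rate` operations per second, with bursts up to `capacity`.

    Hypothetical helper for illustration; not taken from Meta's tooling.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.last = clock()

    def try_acquire(self):
        # Refill in proportion to the time elapsed, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def migrate_batch(schemas, bucket, migrate_one):
    """Migrate what the bucket allows; return the schemas deferred to a later pass."""
    deferred = []
    for schema in schemas:
        if bucket.try_acquire():
            migrate_one(schema)
        else:
            deferred.append(schema)
    return deferred
```

A caller would run `migrate_batch` repeatedly, feeding the deferred list back in, so a misbehaving migration can only do a bounded amount of damage per second.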

And then there were those pesky users, whose role in planting Tulip in Meta’s tech garden necessitated the creation of a migration guide, an instructional video, plus a support team.

“Making huge bets such as the transformation of serialization formats across the entire data platform is challenging in the short term, but it offers long-term benefits and leads to evolution over time,” the post winds up.

“Designing and architecting solutions that are cognizant of both the technical as well as nontechnical aspects of performing a migration at this scale are important for success,” the post adds. “We hope that we have been able to provide a glimpse of the challenges we faced and solutions we used during this process.”

Meta’s four engineers probably have offered useful insights for those who face similar data-wrangling challenges. The rest of you who have lived through legacy migrations? Maybe less so.

And for everyone else, the insight here is that Meta has become more efficient at wielding exabytes of data. Much of it gathered from, and about, you. ®
