Tencent builds one NoSQL database to rule all data models

Tamed DB sprawl and saved cloudy resources with 'X-Stor'

Exclusive: Chinese web giant Tencent has revealed it created a NoSQL database that it believes can handle multiple data models more elegantly than other attempts to do so, and has used it to consolidate its database fleet and improve resource utilization.

The existence of the database – named X-Stor – was recently revealed in a paper [PDF] published in the Proceedings of the Very Large Data Base Endowment, the journal of the non-profit organization that exists to promote and exchange scholarly work on databases and related fields.

The paper opens with observations that NoSQL databases are generally built to handle certain data models. Tencent admits it ran several of them to power its fleet of products – social networks, video streaming services, online games, and a public cloud – that collectively serve more than a billion active users.

Titled "X-Stor: A Cloud-native NoSQL Database Service with Multi-model Support", the paper reveals Tencent used graph databases to store info about user relationships for its social networks, wide-column stores to hold user profiles, document databases to power its advertising operations, and time-series databases to record user behavior data.

That proved less than ideal because Tencent found it hard to support novel data models in existing systems, and so sometimes had to develop a new NoSQL system from scratch. Doing so meant rebuilding functions already found elsewhere – a wasteful overlap.

Like any hyperscaler, Tencent abhors under-used resources. The web giant was therefore not thrilled to learn that "deploying multiple heterogeneous databases at scale leads to system resources isolation for different NoSQL databases, which not only complicates maintenance but also hinders efficient resource sharing among clusters."

X-Stor addresses that issue – allowing the use of different data models by "extending the corresponding storage engine and data access interfaces within the X-Stor system." The independent storage engines "can fully support their respective data models, with performance comparable to that of their single-model counterparts."
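To make the idea of per-model storage engines behind one shared service concrete, here is a minimal sketch in Python. It is not Tencent's code: the class names (`StorageEngine`, `KeyValueEngine`, `TimeSeriesEngine`, `MultiModelStore`) and the `put`/`get` interface are assumptions made purely for illustration, and the paper's real interfaces may look quite different.

```python
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical per-model engine interface (an assumption, not from the paper)."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key using this engine's data model."""

    @abstractmethod
    def get(self, key):
        """Return what is stored under key."""


class KeyValueEngine(StorageEngine):
    """Plain key-value model: one value per key, last write wins."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class TimeSeriesEngine(StorageEngine):
    """Time-series model: each key holds (timestamp, reading) points in time order."""

    def __init__(self):
        self._series = {}

    def put(self, key, value):
        points = self._series.setdefault(key, [])
        points.append(value)
        points.sort()  # keep points ordered by timestamp

    def get(self, key):
        return list(self._series.get(key, []))


class MultiModelStore:
    """One shared front end; each table is bound to the engine for its data model."""

    def __init__(self):
        self._factories = {}  # model name -> engine class
        self._tables = {}     # table name -> engine instance

    def register_engine(self, model, factory):
        # Supporting a new data model means registering one more engine here.
        self._factories[model] = factory

    def create_table(self, table, model):
        self._tables[table] = self._factories[model]()

    def put(self, table, key, value):
        self._tables[table].put(key, value)

    def get(self, table, key):
        return self._tables[table].get(key)


store = MultiModelStore()
store.register_engine("kv", KeyValueEngine)
store.register_engine("timeseries", TimeSeriesEngine)
store.create_table("profiles", "kv")
store.create_table("clicks", "timeseries")

store.put("profiles", "user:1", {"name": "Ada"})
store.put("clicks", "user:1", (1700000060, 2.0))
store.put("clicks", "user:1", (1700000000, 1.0))
```

In this toy version, adding a data model is a matter of writing one more engine class and registering it, while table management and request routing stay shared. That is the general shape of the trade-off the paper describes, under the assumptions stated above.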

The paper claims that's a more elegant arrangement than those used by rival NoSQL databases MongoDB, Redis, and ArangoDB, each of which has its own way of accommodating multiple data models.

X-Stor is serverless and runs as multiple microservices orchestrated by Tencent's own Kubernetes Engine. Tencent initially ran the database on hosts packed with fast SSDs to handle the needs of different data models, such as I/O-intensive key-value and time-series models. However, that left memory under-utilized in some SSD-equipped servers. X-Stor can identify which nodes have the resources needed to match a workload and the data model it employs, so that each node is used as fully as possible.
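The placement idea can be illustrated with a simple best-fit heuristic. This is a sketch under stated assumptions, not the allocation method from Tencent's paper: the `Node` and `Workload` types, the three resource dimensions, and the slack-based score are all hypothetical and chosen only to show how matching workloads to nodes can avoid stranding memory or I/O capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem_gb: float
    free_iops: float


@dataclass
class Workload:
    name: str
    cpu: float
    mem_gb: float
    iops: float


def fits(node, work):
    """True if the node has enough of every resource for the workload."""
    return (node.free_cpu >= work.cpu
            and node.free_mem_gb >= work.mem_gb
            and node.free_iops >= work.iops)


def _slack(free, need):
    """Fraction of a free resource that would remain unused after placement."""
    return 0.0 if free == 0 else (free - need) / free


def slack_score(node, work):
    # Lower is better: the workload uses up more of what the node has left.
    return (_slack(node.free_cpu, work.cpu)
            + _slack(node.free_mem_gb, work.mem_gb)
            + _slack(node.free_iops, work.iops))


def place(nodes, work):
    """Pick the tightest-fitting node, reserve its resources, and return it."""
    candidates = [n for n in nodes if fits(n, work)]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: slack_score(n, work))
    best.free_cpu -= work.cpu
    best.free_mem_gb -= work.mem_gb
    best.free_iops -= work.iops
    return best


nodes = [
    Node("ssd-heavy", free_cpu=8, free_mem_gb=32, free_iops=100_000),
    Node("mem-heavy", free_cpu=8, free_mem_gb=256, free_iops=20_000),
]
kv_node = place(nodes, Workload("kv-table", cpu=2, mem_gb=8, iops=60_000))
cache_node = place(nodes, Workload("cache-table", cpu=2, mem_gb=128, iops=5_000))
too_big = place(nodes, Workload("oversized", cpu=100, mem_gb=1, iops=1))
```

Here the I/O-hungry table can only land on the SSD-heavy node and the memory-hungry one only on the memory-heavy node, while the oversized request is rejected. A production scheduler would weigh many more factors than this toy score does.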

Tencent's paper offers some dense math explaining how workloads compete for and are allocated resources – enjoy its equations if that's your thing.

The bottom line is that the Chinese giant built itself a database it claims can handle any data model – even entirely new ones – and which it has proven can scale to store 12PB of online operational data and serve 700 billion requests per day, with a peak of 30 million requests per second, while handling more than 100,000 tables with multiple data models.
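For a sense of scale, the daily and peak figures can be compared with a little arithmetic. The snippet below uses only the two numbers quoted above and assumes the daily total is spread over a full 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = 700e9  # daily request figure quoted in the article
peak_rps = 30e6           # peak requests per second quoted in the article

# Average rate if the daily total were spread evenly over 24 hours.
average_rps = requests_per_day / SECONDS_PER_DAY
peak_to_average = peak_rps / average_rps

print(f"average: {average_rps / 1e6:.1f} million requests/s")
print(f"peak is {peak_to_average:.1f}x the average")
```

That works out to an average of about 8.1 million requests per second, with the quoted peak roughly 3.7 times higher.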

Sadly, it appears the database is not open source – so the rest of us can't take it for a spin.

China's hyperscalers are doing interesting things. We've recently reported Alibaba Cloud's hardware failure detection code, modular datacenter architecture, and an advanced Ethernet scheme that sees nine NICs installed in the servers it uses for AI model training. Huawei Cloud runs an advanced network health probe. Tencent found a way to halve WAN latency. ®
