Four internet giants have teamed up to create a branch of the MySQL database that's designed to handle massive web applications.
The open-source WebScaleSQL branch of MySQL 5.6 was announced by Facebook on Thursday, and uses version 2 of the GNU General Public License. Engineers from Google, LinkedIn, Twitter, and Facebook have contributed to the project, although the group is inviting other interested parties to join as well.
WebScaleSQL pulls together internal patches programmers at each of the aforementioned companies had developed and applied to the database's C/C++ source code – patches that have to be tweaked and reapplied whenever a new major version of the db engine is prepped for deployment, which is a bit of a pain.
WebScaleSQL also allows the teams to pool together similar new features they've worked on separately. The performance tweaks include better buffer pool flushing, the option to request sub-second client timeouts, and so on.
"We have all been maintaining our own branches for years. Many of the things in these branches take significant effort to rebase onto newer upstream releases, and we've all been doing that separately," explained Steaphan Greene, a Facebook software engineer, in an email to The Reg.
"For this expanding list of changes we have in common, [with WebScaleSQL] we will only have to do this once, for all of us, and not once each. We also have technical expertise and resources distributed widely between these various companies, and can do much more when working together.
"The buffer pool flushing improvements that are now included in WebScaleSQL came about because of Facebook's production experience on MySQL 5.6 and Twitter's expertise on the relevant code. Also, both Facebook and Google (and possibly others) maintain different versions of a massive code change supporting Table Stats. Now we'll only need to maintain one of these.
"And, as with all open software, the more expert eyes on the same code, the better for ensuring quality and reliability."
The group has also added an asynchronous database client, which means applications need not block while querying MySQL, and a new logical read-ahead mechanism that the team says gives a significant improvement in table scans.
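For readers wondering what an asynchronous client buys you, here is a minimal Python sketch. It does not use WebScaleSQL's client or any real MySQL API: `fake_query` is a hypothetical stand-in that simulates a database round trip with a sleep, and the table names and latencies are invented for illustration.

```python
import asyncio
import time


async def fake_query(name: str, latency: float) -> str:
    """Hypothetical stand-in for a database round trip (not a real client call)."""
    await asyncio.sleep(latency)  # simulated network and server wait
    return f"{name}: done"


async def run_queries() -> list:
    # Issue all three queries at once; while one is waiting, the others proceed.
    return await asyncio.gather(
        fake_query("users", 0.2),
        fake_query("posts", 0.2),
        fake_query("likes", 0.2),
    )


def timed_run() -> tuple:
    """Run the queries and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(run_queries())
    return results, time.perf_counter() - start
```

Run back to back with a blocking client, the three simulated queries would take roughly the sum of their latencies; issued concurrently, the total wait is closer to that of the slowest single query.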
"Large-scale deployments of MySQL bring a unique set of challenges, and we launched WebScaleSQL with other companies like Google, LinkedIn, and Twitter who all face similar challenges, so that we can further customize MySQL for our needs," a Facebook spokesperson told us via email. "MySQL 5.6 has some great features, and we'll be working together to leverage these features and add others that are specific to scale-oriented companies."
Facebook "developed the basic framework" for WebScaleSQL, the spokesperson told us; Google reviewed it and suggested some further changes; LinkedIn also reviewed it; and Twitter "contributed several performance improvements".
These four companies deal in terabytes to petabytes of user data and, though many people look down on the basic features of the Oracle-owned MySQL, the wisdom of the web giants seems to be that it's better to have a simple relational DB running at scale on commodity hardware, rather than something more complicated.
The WebScaleSQL site says the companies "reached a consensus that MySQL-5.6 was the right choice... We will continue to revisit this decision as the ecosystem evolves."
As El Reg reported in September, Google is migrating all of its internal systems over to MySQL competitor MariaDB, so we imagine that the search giant's interest in the WebScaleSQL tech is for improving the performance of its consumer-facing CloudSQL service rather than its internal systems.
Binary executables will not be provided, a WebScaleSQL FAQ tells us, as "the focus of WebScaleSQL is to provide a common set of code changes that groups can use as a base to apply further changes that are relevant to their use case."
In other words, feel free to get involved, but it's not for newbs running a WordPress ripoff. ®