Database daddy goes non-relational on NoSQL fanbois
Postgres and Ingres father Michael Stonebraker is answering NoSQL with a variant of his relational baby for web-scale data — and it breaks some of the rules he helped pioneer.
On Tuesday, Stonebraker’s VoltDB company is due to release its eponymous open-source OLTP in-memory database. It ditches just enough DBMS staples to be faster than NoSQL while staying on the right side of ACID, the critical database compliance standard for atomicity, consistency, isolation and durability of data.
VoltDB claimed that the database is between five and ten times faster than NoSQL’s Cassandra on a Dell PowerEdge R610 cluster based on Intel’s Xeon 5550, and it said that VoltDB is 45 times faster than an Oracle relational database on the same hardware, with near-linear scaling on a 12-node cluster.
VoltDB is the fruit of the H-Store project, a collaboration between MIT (Stonebraker’s academic home), Brown University, Yale University and Hewlett-Packard Labs that sought to build a next-generation transaction processing engine. An early H-Store prototype could outperform a commercial DBMS by a factor of 80 on OLTP workloads, it was claimed.
VoltDB co-founder and chief technology officer Stonebraker ran afoul of the web data movement while brewing H-Store, when he compared Google’s MapReduce to the RDBMS and dared to say it was lacking.
He and computer science professor David DeWitt were pilloried by Google fanbois for “not getting” data in the cloud, after the two of them said MapReduce ignored many of the developments in parallel DBMS technology over the last 25 years.
It now seems that Stonebraker — the main architect on Ingres and Postgres and an adjunct professor of computer science at MIT — has answered the fans by reaching for the cloud without offering yet another NoSQL variant to an already crowded field. VoltDB is currently under test with 200 beta customers, and it has one paying user.
Like NoSQLers, VoltDB has bumped its speed by moving data into memory and off disc, and dumping features such as logging, locking, latching and buffer management that hinder performance and scalability in big deployments. VoltDB also adds distributed data partitioning for speed, with performance further helped by the multi-core processors and additional memory in today’s servers.
To retain ACID compliance, VoltDB uses single-threaded partitions that run autonomously while the data itself is replicated in a cluster for high availability.
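Why a single-threaded partition can do without locks and latches is easiest to see in miniature. What follows is a rough sketch in plain Java, not VoltDB's code: the Partition and PartitionDemo classes, the "counter" key and the use of a single-thread executor are all assumptions made for illustration. Each partition owns its data outright and runs queued transactions one at a time, so no two transactions ever touch the same rows concurrently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; not VoltDB's implementation or API.
final class Partition {
    // A plain HashMap is safe here because only the worker thread touches it.
    private final Map<String, Long> rows = new HashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Queue a transaction; each one runs to completion before the next starts.
    <T> Future<T> submit(Function<Map<String, Long>, T> txn) {
        Callable<T> task = () -> txn.apply(rows);
        return worker.submit(task);
    }

    void shutdown() {
        worker.shutdown();
    }
}

final class PartitionDemo {
    // Several client threads queue increments against one partition.
    static long runIncrements(int clients, int perClient) {
        Partition partition = new Partition();
        try {
            Thread[] threads = new Thread[clients];
            for (int c = 0; c < clients; c++) {
                threads[c] = new Thread(() -> {
                    for (int i = 0; i < perClient; i++) {
                        partition.submit(rows -> rows.merge("counter", 1L, Long::sum));
                    }
                });
                threads[c].start();
            }
            for (Thread t : threads) {
                t.join();
            }
            // Queued after every increment, so it sees all of them.
            return partition.submit(rows -> rows.getOrDefault("counter", 0L)).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            partition.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runIncrements(4, 250)); // prints 1000
    }
}
```

No increments are lost even though four client threads submit work at once, because the partition serialises them. The sketch leaves out replication across the cluster, which the article says VoltDB uses for high availability.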
VoltDB vice president of marketing Andy Ellicott told The Reg: “The challenge is to remove all the overhead, but — unlike the NoSQL key value store guys — keep ACID properties, to make sure the database maintains integrity of data automatically and make sure the database can be accessed by the SQL.
“We will work between partitions in an ACID way — that separates us from the sharders, as opposed to writing roll-back logic. It’s true ACID.”
While VoltDB has followed the NoSQL key-value crowd by ditching DBMS features, it has held on to the relational data model. It has also hidden the underlying complexity that programmers would otherwise have encountered when writing their software to run across different clusters. The problem with other projects is that by dumping the relational model, they’re forcing architects to find new ways of building their applications.
Programming to VoltDB is done through Java stored procedures, while VoltDB routes queries to different servers without the programmer’s application needing to know the topology of the cluster.
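The article does not say how VoltDB decides which server gets a query, so the following is only a sketch of one common approach, hash partitioning on a key. The Router class, its method names, the node labels and the "GetAccount" procedure name are all invented for illustration and are not VoltDB's client API. The point is that the application names a procedure and passes a key, and never refers to a server.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are invented and this is not VoltDB's client API.
final class Router {
    private final List<String> nodes;

    Router(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Map a partitioning key to one node; floorMod keeps the index non-negative.
    String nodeFor(Object partitionKey) {
        int index = Math.floorMod(partitionKey.hashCode(), nodes.size());
        return nodes.get(index);
    }

    // The caller supplies a procedure name and a key; the location stays hidden.
    String call(String procedure, Object partitionKey) {
        return procedure + " -> " + nodeFor(partitionKey);
    }
}
```

With three nodes named node-0, node-1 and node-2, a call such as router.call("GetAccount", 7) is always sent to node-1, since an Integer hashes to its own value and 7 mod 3 is 1. A scheme this simple would reshuffle most keys whenever a node is added, which may be one reason automatic repartitioning is a feature in its own right.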
VoltDB is available as a community edition (free under a GPL license) and under a commercial license, which costs $15,000 for a four-server cluster on an annual subscription. That price buys you roadmap updates, patches, and services. Each additional server is priced at $3,000. Updates to VoltDB planned within the next year will let you add clusters and repartition automatically. ®