What do we want? Strong consistency! When do we... oh, it's in Riak v2
NoSQL datastore flexes muscles to woo enterprises
RICON West 2013 Riak-steward Basho has spliced crucial enterprise features into the second version of its NoSQL distributed database, and also admitted that its system can't do everything on its own.
The technical preview of version two of the software was released at the company's RICON West conference in San Francisco on Tuesday, bringing with it an option for strong consistency, better access-control policies, advanced security, and mix-and-match replica allocations.
What will make enterprises salivate, we reckon, is the arrival of strong consistency.
Riak previously offered only eventual consistency, which is relatively fast but can leave the precise value of an accessed object uncertain for a short time. Now the system also provides strong consistency, which guarantees the integrity of every transaction but is relatively slow.
Simply put, you'd use eventual consistency for retrieving prices for stuff in an online marketplace, for instance, but use strong consistency when calculating the customer's total at final shopping basket checkout.
Consistency is crucial to distributed databases, and is one of the three key elements in Eric Brewer's CAP theorem, which states that a distributed system can have any two of consistency (C), availability (A) and partition tolerance (P), but never all three.
Riak had previously had the A and P parts, but now it can have the C, for some workloads some of the time, and with caveats.
"You can choose eventual or strong consistency," Basho's chief technology officer Justin Sheehy says. "Nobody gets to beat CAP, but for a named subset of your data you can choose to have a very different mechanism used for propagating writes and reads to replicas."
This approach may increase overall latency, but will give enterprises the ability to have strong guarantees when accessing a subset of their data.
"In all the cases where Riak would normally accept a conflicting value, instead all but one of those conflicting values [will] fail loudly back to the client," Sheehy explained.
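The "fail loudly" behaviour Sheehy describes can be pictured with a toy example. What follows is a minimal single-process sketch, not Riak's API or its replication protocol: the `StrongBucket` class, the `ConflictError` exception and the version-number scheme are all hypothetical names invented for illustration. Each write must name the version it was based on, and only one of two conflicting writes is accepted.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale version of the key."""


class StrongBucket:
    """Toy in-memory stand-in for a strongly consistent subset of data.

    Hypothetical illustration only; this is not how Riak is implemented.
    """

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # A key that was never written is reported as version 0, value None.
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        current_version, _ = self.get(key)
        if expected_version != current_version:
            # Reject the write instead of silently keeping both values.
            raise ConflictError(
                f"{key}: expected version {expected_version}, "
                f"store is at {current_version}"
            )
        new_version = current_version + 1
        self._data[key] = (new_version, value)
        return new_version


def demo():
    bucket = StrongBucket()
    # Two clients read the same (empty) basket and then both try to write.
    version, _ = bucket.get("basket:42")
    outcomes = []
    for total in (19.99, 24.99):
        try:
            bucket.put("basket:42", total, expected_version=version)
            outcomes.append("ok")
        except ConflictError:
            outcomes.append("conflict")
    return outcomes, bucket.get("basket:42")


if __name__ == "__main__":
    print(demo())
```

In this sketch the losing client has to re-read the key and retry, which is the extra round trip that makes the strong path slower than the eventually consistent one.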
Conflict-free replicated data types
To support the use of Riak as a distributed data store, the company has also added distributed data types – 'sets', 'registers', 'flags', and 'maps' – based on research into conflict-free replicated data types (CRDTs). This, the company says, should "enable developers to spend less time thinking about the complexities of vector clocks and sibling resolution and, instead, focusing on using familiar, distributed data types to support their applications’ data access patterns." Further information on the new technology is available in this document on Basho's GitHub pages.
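To give a flavour of why such types remove the need for manual sibling resolution, here is a minimal sketch of a grow-only set, the simplest set CRDT from the research literature. It is not Riak's implementation (the article does not say which set design Riak uses), and the `GSet` class is a hypothetical name. The point is that the merge is a plain set union, so replicas can exchange state in any order and still converge.

```python
class GSet:
    """Grow-only set: elements can be added but never removed.

    Illustrative only; a swapped-in textbook CRDT, not Riak's data type.
    """

    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, item):
        # Return a new replica state that also contains `item`.
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative and idempotent, so the order
        # and number of merges does not affect the final result.
        return GSet(self.items | other.items)

    def value(self):
        return set(self.items)


def demo():
    base = GSet(["apple"])
    # Two replicas accept different writes while out of contact.
    replica_a = base.add("pear")
    replica_b = base.add("plum")
    merged_ab = replica_a.merge(replica_b)
    merged_ba = replica_b.merge(replica_a)
    return merged_ab.value(), merged_ba.value()


if __name__ == "__main__":
    print(demo())
```

Both merge orders yield the same three-element set, with no application code needed to pick a winner.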
The company has also made various tweaks around usability, including shifting configuration management away from Erlang literal syntax to a standardized syscontrol file format.
This should make the database easy to maintain even for people not directly familiar with it, Sheehy said.
"Someone that's done much of any administration of their servers they will immediately understand it and edit it without any confusion," he said.
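The appeal of a flat configuration file is easy to show. The sketch below assumes the new format is a list of `key = value` lines with `#` comments, in the style of a Unix sysctl.conf; that is an assumption, since the article does not spell out the syntax. The setting names in `EXAMPLE` and the `parse_flat_config` helper are hypothetical and are not real Riak options.

```python
def parse_flat_config(text):
    """Parse 'key = value' lines into a dict, ignoring comments and blanks.

    Assumes a sysctl-style flat format; hypothetical helper for illustration.
    """
    settings = {}
    for raw_line in text.splitlines():
        # Drop anything after a '#' and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical settings, invented for this example.
EXAMPLE = """
# node identity
node.name = example@127.0.0.1
storage.backend = memory   # trailing comments are allowed
"""

if __name__ == "__main__":
    print(parse_flat_config(EXAMPLE))
```

A file like this can be read and edited line by line, with no nested brackets or trailing-comma rules to trip over.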
Riak's replication model has been given a tuneup, the company said. Where previously IT departments needed to store three copies of their data in every data center, they can now change this number according to their needs.
This lets sysadmins store fewer or more copies of replicated data across multiple facilities, and they can mix and match the amount of data replicated as required. For instance, if a business has a multitude of co-location facilities around the world then it may want to have three replicas in its primary facility and single copies in others.
Though Basho has added several features to the new version of Riak, it has also stepped back from others: Riak 2 will offer full search integration with the Apache Solr project, rather than rely on the company's own search tech.
"Some of the best people in the world have been working on Apache Solr for years," Sheehy said, then noted that as Basho's expertise is in distributed systems and databases it would be "too much hubris" to try for search as well.
Though Riak originally began life as a clever implementation of some of the ideas found in Amazon's seminal Dynamo paper, it has since grown into a full-fat database with good reliability properties, and excellent traits for the backing store of a distributed system.
"We started out providing all of our guarantees at the plumbing layer. The earliest versions had an incredibly spartan user interface and we've improved that over time," Sheehy said. "We built from the bottom up instead of the top down." ®