After an Apple engineer called an 11th-hour halt to the release, Cassandra 4.0 has finally launched flaunting newfound stability, speed and consistency, according to the open-source project's users and contributors.
The code for the wide-column database – which has been popular as a distributed system with users including Apple, Instagram and eBay – officially went live today, around six years after 3.0's debut.
The developer community is said to have taken the extra time to make this the most stable release of the NoSQL system yet, and to ship it with no known bugs.
Speaking to The Register in the run-up to launch, Vinay Chella, engineer and cloud data architect at Netflix, said the new model for streaming data between nodes made that streaming between four and five times faster, accelerating recovery from failed nodes and reducing costs.
"The non-blocking I/O based streaming… essentially replaces single-threaded or sequential synchronous blocking model to stream the data between the nodes. What that means is when a node fails, the time to react and remediate node failures is improved. In the past, if it takes five minutes, now it takes one minute, let's say," said Chella.
The higher speed also translates into greater efficiency: a deployment that might have required 96 nodes on an earlier version could run on 84 with Cassandra 4.0, he said.
Netflix has been using Cassandra since 2013, when it began replacing Oracle databases, and relies on the NoSQL system to support accounts and customer data worldwide.
For an organisation the size of Netflix, the greater efficiency in 4.0 was a "big deal," Chella said. "Data demands are growing. We keep scaling up the clusters. If we introduce Apache Cassandra 4.0, we probably don't need to scale up the cluster for the next few years, and moreover we can scale it down, and probably reduce our cloud bill."
Although not a feature per se, the increased stability of Cassandra 4.0 might give businesses more confidence in the database, particularly those without huge engineering teams.
"The Cassandra community, and the big companies around the world which are using Cassandra, have put countless hours of efforts testing Apache Cassandra 4.0 even before it is released internally in their environments, to ensure... it is production-ready," Chella said.
Who doesn't want to avoid those "nasty issues" rearing their heads in production environments when "petabytes of data, thousands of nodes, and hundreds of clusters" are at risk?
Determination to see it released as fault-free as possible saw Apple engineer and Cassandra contributor Jon Meredith request an extension to the release vote due to a possible issue "serializing FWD_FRM" just days before it was scheduled for release.
Carl Olofson, IDC research vice president, said the scalability, recoverability, streaming data support, and log analytics available in the new release might be welcomed by enterprise users. "The scalability improvement is of particular significance, especially considering the diverse deployment methods available on-prem and in the cloud," he said.
Cassandra 4.0 also promises enhanced security and observability, such as audit logging to track user access and activity. The aim is to help with regulatory and security compliance under rules such as SOX in the US and Europe's GDPR. New configuration settings and garbage collectors also feature.
Ben Bromhead, CTO at support and consultancy firm Instaclustr and a Cassandra community member, said that with the list of "enterprise" features, he expects Cassandra 4.0 to become much more widely adopted. "You can see companies of all sizes [and] users of all sizes kind of picking it up and starting to run with it," he said. ®