MongoDB developers 10Gen tool-up NoSQL database
Polish paid-for version, fix flaws, pray for wider adoption
MongoDB has been given a search engine, analytical features, and broader reporting capabilities as steward 10Gen tries to make the NoSQL database more accessible.
Version 2.4 of the open source NoSQL database became generally available on Tuesday. The additions to the platform deal with some of the usability criticisms that the popular technology has faced.
These features – an integrated search engine that means developers don't have to bolt on Apache Solr or Elasticsearch, hash-based sharding for better read and write distribution, a variety of onboard analytics, and geospatial support – are intended to help 10Gen give MongoDB a greater role in organizing and storing vast datasets.
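To see why hash-based sharding spreads load more evenly, consider a toy comparison of range-based and hashed placement for a burst of sequential keys. This is a minimal sketch under stated assumptions: the shard count, the key space, the use of MD5, and the modulo placement are all chosen for illustration and do not describe MongoDB's internal chunk assignment.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4      # hypothetical cluster size
MAX_KEY = 100_000   # hypothetical key space


def range_shard(key: int, max_key: int = MAX_KEY, num_shards: int = NUM_SHARDS) -> int:
    """Range-based placement: neighbouring keys land on the same shard."""
    return min(key * num_shards // max_key, num_shards - 1)


def hashed_shard(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Hashed placement: hash the key first, then pick a shard from the hash."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The 1,000 most recent inserts of an ever-increasing ID.
recent_keys = range(99_000, 100_000)

range_counts = Counter(range_shard(k) for k in recent_keys)
hash_counts = Counter(hashed_shard(k) for k in recent_keys)

print("range-based:", dict(range_counts))
print("hash-based: ", dict(hash_counts))
```

With range-based placement every one of the recent writes hits the last shard, while hashing scatters the same keys across all four. The trade-off is that range queries over neighbouring keys can no longer be served by a single shard.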
"My personal feeling is that enterprises are going to congregate around just a few databases, NoSQL and RDBMS," 10Gen's vice president of corporate strategy (and former Reg columnist) Matt Asay told us. "It's simply too hard to manage a wide array of niche technologies that may do one thing incredibly well, but aren't useful for a broad variety of applications."
Asay's opinion chimes with a view held by some developers that companies have adopted too many NoSQL platforms, and are now paying the price in terms of added complexity and expensive developer salaries.
And if companies are going to consolidate their NoSQL systems, MongoDB looks like a good candidate – it's the most popular NoSQL system according to DB-Engines, which ranks databases according to internet-chatter, job listings, LinkedIn mentions, and Google Trends results.
The technology has also been blessed by Rackspace, which acquired MongoDB-as-a-service (MDBaaS?—Ed.) start-up ObjectRocket in February to help it take on Amazon Web Services.
For MongoDB developers, the 2.4 features broaden the remit of the technology and also deal with two of its major problems: the difficulty devs find in anticipating how much memory will need to be assigned to frequently accessed information, or "hot data", and MongoDB's predilection for enthusiastically failing-over during brief network brownouts.
Version 2.4 deals with the memory problem by letting the database estimate the working set size of the hot data. This helps devs set aside enough RAM for the database before loading the set in, and can head off problems caused by not having enough memory.
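As a rough illustration of how such an estimate might be used, the sketch below converts a page count into bytes and checks it against available RAM. The assumptions are loud ones: the `workingSet.pagesInMemory` field name reflects our understanding of the 2.4 `serverStatus` output, the 4KB page size and 80 per cent headroom are guesses, and the sample document is invented rather than taken from a real server.

```python
PAGE_SIZE_BYTES = 4096  # assumed page size; check your platform


def working_set_bytes(server_status: dict, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Convert the reported number of in-memory pages into bytes."""
    pages = server_status["workingSet"]["pagesInMemory"]
    return pages * page_size


def fits_in_ram(server_status: dict, ram_bytes: int, headroom: float = 0.8) -> bool:
    """True if the estimate fits in the given RAM, leaving some headroom."""
    return working_set_bytes(server_status) <= ram_bytes * headroom


# Invented sample shaped like a serverStatus reply (not real server output).
sample_status = {"workingSet": {"pagesInMemory": 500_000, "overSeconds": 900}}

estimate = working_set_bytes(sample_status)
print(f"estimated working set: {estimate / 1024**3:.2f} GiB")
print("fits in 4 GiB:", fits_in_ram(sample_status, 4 * 1024**3))
print("fits in 2 GiB:", fits_in_ram(sample_status, 2 * 1024**3))
```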
Tweaks have also been made to how the system behaves when the network is unstable, to avoid expensive fail-overs during brief blips.*
10Gen also polished the paid-for version of the software, MongoDB Enterprise, which has been given role-based privileges and on-premises monitoring covering more than 100 operational metrics. The company has over 600 customers, including Cisco, Craigslist, McAfee, Salesforce.com, and Telefonica.
In the past half-decade there's been a proliferation of new NoSQL companies, but all frothy waves must eventually recede. We reckon that the industry is heading into a period of consolidation.
NoSQL systems that don't receive enterprise adoption – and a resultant cash lifeline to keep their stewards going – will likely wither and die. Version 2.4 sees 10Gen trying to reduce the likelihood of MongoDB being one of these systems. ®
* All MongoDB clusters have a primary node whose job is to receive writes and percolate them out to secondary nodes. If a primary node server dies, then the remaining nodes elect a new primary. This cut-over process takes a few seconds, during which time the system can't accept writes. If hardware fails, this can be a useful feature, but if network instability causes a primary node to disappear and then reappear a few seconds later, the cut-over can do more harm than good.
"If your network is going up and down every five seconds ... failing over doesn't help, it only hurts," 10Gen cofounder Dwight Merriman says. "It would be better if you didn't get the failover behavior."
10Gen has fixed this by instituting some heuristic reasoning on the nodes to let them figure out when a node's disappearance is more likely due to a networking problem than to a major failure. By avoiding the cut-over when the network is noisy, this saves administrators time.
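10Gen has not spelled out the heuristics here, so the following is a hypothetical stand-in rather than MongoDB's actual election logic. It is a minimal sketch of one common way to dampen failover: only trigger an election once the primary has missed several heartbeats in a row. The function name, the heartbeat model, and the thresholds are all assumptions made for illustration.

```python
def count_failovers(heartbeats: list[bool], misses_required: int) -> int:
    """Count elections triggered by a heartbeat trace.

    heartbeats: True means the primary answered, False means it did not.
    misses_required: consecutive misses needed before an election starts.
    Simplification: we only track the original primary's reachability.
    """
    failovers = 0
    consecutive_misses = 0
    for reachable in heartbeats:
        if reachable:
            consecutive_misses = 0
            continue
        consecutive_misses += 1
        # Fire once per outage, at the moment the threshold is reached.
        if consecutive_misses == misses_required:
            failovers += 1
    return failovers


# A flapping network: the primary vanishes for one heartbeat, five times.
flapping = [True, False] * 5
# A real failure: the primary goes away and stays away.
dead = [True, True] + [False] * 6

print("flapping, eager:  ", count_failovers(flapping, misses_required=1))
print("flapping, patient:", count_failovers(flapping, misses_required=3))
print("dead, patient:    ", count_failovers(dead, misses_required=3))
```

The eager setting fails over on every blip, while the patient one ignores the flapping yet still reacts to the genuine outage, at the cost of waiting a few heartbeats longer before writes resume.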