MongoDB wants to grab work from other databases
Goal is to bring analytics to its transactional workloads but there are places where this might fall short
Analysis At MongoDB's recent conference in New York, the company demonstrated its ambition to take on workloads from other databases.
The company has made significant inroads into the database market with a developer-friendly distributed document database to help devs build modern, web-based, transactional systems.
Time series and search have become targets, with the promise of support for secondary indexes in the former, and Search Facets to help developers build search experiences more rapidly in the latter.
But it was the continued push into analytics that impressed commentators, who were also keen to point out the limits to what could be achieved in a document database.
Available later this year, column store indexing will help developers create and maintain a purpose-built index that dramatically speeds up many common analytical queries without requiring any changes to the document structure or having to move data to another system, the company said.
Speaking to The Register, MongoDB chief product officer Sahir Azam said developers were often forced to aggregate data in a third-party system and then bring it back into their database to operationalize complex analytical queries as part of their application.
"We've added a slew of capabilities into the database and [DBaaS] Atlas to make it easier to enable in app application experiences," he said.
"We're seeing a lot more richer or smarter application experiences where what would typically be a human making a decision and off of a Tableau dashboard, it's now something that a development team is automating in software.
"But those queries are often very different than what you would think of a traditional kind of transaction. They're much more like an analytical-style columnar query as opposed to a transactional query. So, we have been working on performance improvements in our query engine, a new indexing type called a column store indexing, which is all about improving the performance for complex analytical queries so that they can be embedded in the application experience," he said.
Testing with synthetic data and real customer workloads had shown MongoDB improving its performance on complex analytical queries by between 5x and 200x, he said. Applications might include fraud analysis in financial services, next-best offer in ecommerce, or supply chain management.
'They're attempting to make so the database performance is not negatively impacted by analytics'
Kimberly Wilkins, MongoDB technical lead with database consultancy Percona, said that in executing heavy analytics in MongoDB — even running it in a separate node — developers would have previously seen a negative impact on performance.
She said there had been significant improvements in its sync capabilities, and it also now allows analytics nodes in a replica set to be sized larger than the nodes used for writes and for irregular reads. "That's a huge thing they've been able to do. They're attempting to make so the database performance is not negatively impacted by analytics, so you can run your heavy analytics against MongoDB," she said.
Even so, developers and data architects who have already set about building a cloud data warehouse such as Snowflake or AWS Redshift for analytics alongside MongoDB are unlikely to change their minds because of MongoDB's enhancements. It may affect future decision-making, however, Wilkins said.
"If people are starting to think they need a little bit of analytics, but they've got a document database and it's going to kill their write performance and their read performance, that's not going to be the case anymore, if they do it right with MongoDB. It's actually very, very impressive," she said.
Tony Baer, principal at analyst firm dbInsight, described MongoDB's move into analytics as "taking baby steps" – allowing lightweight queries without impacting operational performance, but with important limits on its usage.
"The first principle in operational databases, is the last thing you want to do is slow it down. To do all this complex modelling that you would do in Databricks or complex analytics that you would do in Snowflake, you really do not want to burden the operational database with that, and that's not what it's meant for, even though you can partition the load in [MongoDB DBaaS] Atlas and have separate nodes. What it is meant for is, you can make a smart decision on the spot," he said.
Speaking on SiliconAngle's The Cube, he said similar ideas were behind Oracle's move with MySQL HeatWave and Google's AlloyDB.
Matt Aslett, VP and research director with Ventana Research, said cluster-to-cluster synchronization, which syncs data across clouds and on-premises clusters, and data federation, which queries data across multiple clusters, were among the most significant announcements from last week's event, adding to the momentum MongoDB had sustained with developers of modern applications.
"The company has done well in engaging with developers creating new web applications utilizing the document model and JSON format, especially for web applications.
"Although much of the company's early success was driven by internet and applications startups it is increasingly gaining traction with established enterprises in industries such as financial services, insurance, healthcare, and government, including being adopted for workloads that were historically the domain of relational databases," he told The Register.
However, the company managed to muddy the waters on what it was commercially supporting in the current release. "Some of the announcements related to generally available features, and some of which were previews of forthcoming functionality. This enables users to start developing to take advantage of imminent features, but the large number of announcements does have the potential to cause confusion in relation to whether individual features are commercially supported or not," Aslett said.
MongoDB has ambitious expansion plans and hosted a glitzy event in New York last week, even though it remains loss-making. Yet investor confidence is buoyed by the company's rapid growth, Aslett said.
"MongoDB's revenue is not only continuing to grow, but year-on-year quarterly revenue growth accelerated during its fiscal 2022. I would expect it to retain the confidence of investors as long as it continues to meet or exceed expectations," he said.
If that remains the case, it could be among the handful of NoSQL startups to see out a vision to bring analytics and operational workloads closer together. ®