Google advances with vector search in MySQL, leapfrogging Oracle in LLM support

Meanwhile, only 22% of orgs are looking at GenAI strategy for databases

Google has introduced vector search to its MySQL database service, surpassing Oracle – custodian of the open source database – which has so far failed to add the feature deemed an advantage in executing large language models (LLMs).

The Chocolate Factory announced vector search – in preview – across several Google Cloud databases, including Cloud SQL for MySQL, Memorystore for Redis, and Spanner, Google's distributed database management and storage service.

Andi Gutmans, vice president for databases, Google Cloud, said over the last 12 years, Google had been innovating quite rapidly with vectors.

Vectors are a foundational element of LLMs, which have become an obsessive focus of big tech, governments, and the media since ChatGPT launched in 2022. LLMs rely on words or other components of language being represented as vector embeddings according to their statistical similarity with other words. Google was behind Word2Vec, a technique for natural language processing launched in 2013, although it has since been superseded by the transformer architectures adopted by LLMs.
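To make the idea of similarity between embeddings concrete, here is a minimal Python sketch of a brute-force nearest-neighbor lookup using cosine similarity. The three-dimensional vectors, the word list, and the helper names `cosine_similarity` and `nearest` are all invented for illustration. Real embeddings have hundreds or thousands of dimensions, and this is not how Google or any vendor mentioned here implements vector search.

```python
import math

# Toy 3-dimensional embeddings, made up for illustration only.
# Real models produce vectors with hundreds or thousands of dimensions.
EMBEDDINGS = {
    "database": [0.9, 0.1, 0.0],
    "datastore": [0.8, 0.2, 0.1],
    "chocolate": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, k=1):
    """Return the k words whose embeddings are most similar to `word`."""
    query = EMBEDDINGS[word]
    scored = [
        (cosine_similarity(query, vector), other)
        for other, vector in EMBEDDINGS.items()
        if other != word
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


if __name__ == "__main__":
    print(nearest("database"))
```

Scanning every stored vector like this only works for small collections. Database vector search features generally rely on approximate nearest-neighbor indexes to avoid comparing the query against every row.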

By introducing vector search to MySQL – ranked second in the market, behind only Oracle, according to DB-Engines – Google has overtaken Oracle's open source MySQL.

Dave Stokes, technology evangelist at open source database support business Percona, said Oracle engineering has no plans to support vectors or anything like a nearest-neighbor search for the community edition.

"Sadly, Oracle seems to be putting all its resources into HeatWave while doing the absolute minimum for the community edition," he said. "This will put MySQL further behind other options like PostgreSQL and new Vector databases. The general lack of new features and capabilities in the community edition while embedding JavaScript and vectors into the commercial version will make community customers seek other alternatives such as what Google is offering."

The Register has contacted Oracle to offer it the opportunity to respond.

Google is not the only vendor to add vector search to a MySQL service, though. PlanetScale, the MySQL/Vitess-based distributed transactional system, announced the new feature in October last year.

Redis, the popular in-memory database often used as a cache and system broker, has promised vector search in coming releases.

Last week, Couchbase, the distributed document database, introduced vector search as a new feature in DBaaS Capella and Couchbase Enterprise Edition.

Scott Anderson, senior vice president of product management and business operations at Couchbase, said adding vector search to the platform is the next step in "enabling our customers to build a new wave of adaptive applications."

Last year, Oracle database, Cassandra, MongoDB, PostgreSQL, and SingleStore added support for vector search to their database systems, while a segment of specialist vector databases, such as Pinecone, has sprung up to support the computing trend.

Noel Yuhanna, Forrester Research vice president and principal analyst, said vector search was more or less standard now for any serious enterprise database.

"Those who don't have it will likely see an impact on their growth. Based on our research, about 35 percent of enterprises are looking at vector databases, which is expected to grow to 50 percent over the next 18 months," he said.

He said vector search was becoming critical for GenAI applications to help seek out similar data, images, and documents, with applications emerging in customer intelligence, fraud detection, chatbots, and content personalization.

While specialist vector databases have their advantages, integrated databases provide organizations with more context and a richer data experience, Yuhanna said. "No vendor stands out since vector capabilities are still evolving, and many haven't demonstrated high-end scale."

However, only about 22 percent of organizations were looking at an LLM/GenAI strategy for their databases right now, although Forrester expected that to double in the next two to three years. "Most of the demand we see is for new GenAI apps that want to leverage vector for a new deployment; for existing databases to move towards vector, we are looking at least a few years away," Yuhanna said.

Google is also trying to bring its own GenAI model closer to its analytics environment. Google has said it is making Gemini accessible for users of BigQuery, its data warehouse system, via Vertex AI. The new integrations with the AI and ML platform are designed to help data engineers and analysts use Gemini models for multimodal and advanced reasoning capabilities for their BigQuery data.

Yuhanna said bringing Vertex AI, BigQuery, and BigLake closer together would help organizations not only avoid data movement but also offer insights, improve data governance and security, remove redundant data, and lower costs by minimizing administration requirements.

He said it was part of the trend for enterprises to merge unstructured data with structured BI-style data in the so-called lakehouse concept, now adopted by around a quarter of enterprises, to lower costs and run BI, data science, AI/ML, operational insights, and SQL analytics on a single platform. ®
