Databricks snaps up MosaicML to build private, custom machine models

Acquisition means both parties get a shot at leading the roll-your-own AI market

Analysis Databricks has announced it will acquire generative AI startup MosaicML for $1.3 billion, in a deal that will make it easier for private entities to train and run their own custom machine learning models.

The takeover appears to be a logical step to enable growth for both parties. Databricks' core platform helps customers store and sort incoming data from different sources in their own cloud clusters, while MosaicML offers tools to spin up custom AI models at low cost.

Together the two outfits have the technical infrastructure and expertise to attract large businesses that want to use their own data to train and deploy generative AI systems. Many small to medium enterprises want to adopt machine learning, but are apprehensive about turning to off-the-shelf models built by private companies.

Above all, they don't want to share proprietary information with potential competitors, and are wary of relying on third-party models if they don't know exactly how those models will behave. OpenAI and Google, for example, don't disclose exactly what data is used to train their models, and it's not outside the realm of possibility that those models could act in ways that are difficult to predict.

"Whatever data is used to train a model, that model now represents that data and so wherever the model weights go, the data has gone. So for enterprises to really start to adopt these capabilities, model ownership has to be respected to respect their data privacy balance," Naveen Rao, CEO and co-founder of MosaicML, previously told The Register.

Databricks and MosaicML offer certainty over which data is used to train a model. The acquisition means that Databricks will have better AI resources, while MosaicML will gain access to better data with which to build custom private models. We note that MosaicML, formed in 2021, has 64 staff; Databricks was formed in 2013, and has more than 4,000 onboard.

Over the last couple of months, MosaicML has released two open source large language models – the MPT-7B and the MPT-30B – that it claims demonstrate that training can cost hundreds of thousands of dollars instead of the multi-millions needed by alternative model-makers.

Smaller models aren't as generally capable as something like GPT-4, but businesses don't necessarily need that. Many want a system that performs well at specific tasks, and will gladly choose smaller models if they can have more control over the system and reduce development costs.

"The economics have to be favorable. It really comes down to how optimized you can make it," Rao told The Reg.

"Every time you type into ChatGPT, it's doing an inference call and it's spitting out words. Each one of those is basically running on a web server of eight GPUs. That server costs $150,000, approximately, to build. So there's a real hard cost to this," Rao said. 

"Really optimizing that stack – hacking multiple requests together and utilizing the hardware effectively – is the name of the game. It's much more about making it super efficient so that you're not unnecessarily wasting GPU time on a smaller scale per request."
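To put rough numbers on Rao's point, here is a back-of-the-envelope sketch in Python. Only the $150,000 price for an eight-GPU server comes from Rao. The three-year write-off period, the two-second batch time, and the batch sizes are illustrative assumptions, not figures from MosaicML or OpenAI, and the sketch ignores power, hosting, and staffing costs.

```python
# Back-of-the-envelope hardware cost per inference request.
# Only SERVER_COST_USD comes from the article; everything else is an assumption.

SERVER_COST_USD = 150_000            # Rao's approximate cost of one eight-GPU server
AMORTIZATION_YEARS = 3               # assumption: hardware written off over three years
SECONDS_PER_YEAR = 365 * 24 * 3600


def cost_per_second(server_cost=SERVER_COST_USD, years=AMORTIZATION_YEARS):
    """Hardware cost of keeping one server running for one second."""
    return server_cost / (years * SECONDS_PER_YEAR)


def cost_per_request(batch_size, seconds_per_batch=2.0):
    """Hardware cost attributed to a single request.

    Assumes, as a simplification, that the server spends `seconds_per_batch`
    (a hypothetical figure) on a batch regardless of how many requests it
    contains, so the cost is shared evenly across the batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return cost_per_second() * seconds_per_batch / batch_size


if __name__ == "__main__":
    for batch_size in (1, 8, 32):
        print(f"batch={batch_size:>2}: ${cost_per_request(batch_size):.6f} per request")
```

Under these assumptions the cost per request falls in direct proportion to the number of requests packed into each batch. Real systems will not scale this cleanly, since larger batches take longer to process.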

The custom AI market is heating up. By working under Databricks, MosaicML will be able to go after larger customers in more diverse industries.

Databricks already claims to work with over 10,000 organizations worldwide. With the addition of MosaicML's tools, it has a seemingly better machine-learning pipeline to attract businesses from its rivals and retain customers as they increasingly invest in AI.

Databricks said it hopes to close the deal in July, and that all MosaicML employees will move over to the combined biz. ®
