AI + ML

Databricks wants one tool to rule all AI systems – coincidentally, its own MLflow tool

Turns out people are not that great at tracking thousands of variables

Fri 7 Jun 2019 // 18:12 UTC

American upstart Databricks, established by the original authors of the Apache Spark framework, reckons its open-source machine-learning management engine MLflow is ready for prime time.

Version 1.0 of the platform focuses on the core API components. It improves the handling of metrics and search functionality, and adds support for the Hadoop Distributed File System (HDFS) as an artifact store, in addition to the previously supported Amazon S3, Azure Blob Storage, Google Cloud Storage, SFTP, and NFS.

It also adds an experimental Open Neural Network Exchange (ONNX) model flavour, and a CLI command for building a Docker image capable of serving an MLflow model.

And finally, there’s Windows support for the MLflow client – in the unlikely event data scientists decide to opt for something other than Linux.

MLflow enables data scientists to track and distribute experiments, package and share models across frameworks, and deploy them – no matter if the target environment is a personal laptop or a cloud data centre.
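To make "tracking experiments" concrete, here is a minimal sketch of the idea in plain Python: each training run's parameters and metrics are written to a store, and the store can later be searched for the runs that cleared a bar. This is an illustration only, not MLflow's API. The `RunStore` class, its `log_run` and `search` methods, and the sample parameter values are all hypothetical names invented for this example.

```python
import json
import tempfile
import uuid
from pathlib import Path


class RunStore:
    """Toy file-backed experiment tracker (hypothetical; not the MLflow API)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def log_run(self, params, metrics):
        """Persist one training run's parameters and metrics as a JSON file."""
        run_id = uuid.uuid4().hex
        record = {"run_id": run_id, "params": params, "metrics": metrics}
        (self.root / f"{run_id}.json").write_text(json.dumps(record))
        return run_id

    def search(self, metric, minimum):
        """Return runs whose metric is at least `minimum`, best first."""
        runs = [json.loads(p.read_text()) for p in self.root.glob("*.json")]
        hits = [
            r for r in runs
            if r["metrics"].get(metric, float("-inf")) >= minimum
        ]
        return sorted(hits, key=lambda r: r["metrics"][metric], reverse=True)


def demo():
    """Log three made-up runs and return those with accuracy >= 0.8."""
    with tempfile.TemporaryDirectory() as tmp:
        store = RunStore(tmp)
        store.log_run({"learning_rate": 0.1, "depth": 3}, {"accuracy": 0.81})
        store.log_run({"learning_rate": 0.01, "depth": 5}, {"accuracy": 0.93})
        store.log_run({"learning_rate": 0.5, "depth": 2}, {"accuracy": 0.64})
        return store.search("accuracy", minimum=0.8)


if __name__ == "__main__":
    for run in demo():
        print(run["params"], run["metrics"])
```

A real tracking tool adds much more on top of this, such as remote artifact storage, a UI and model packaging, but the core bookkeeping problem is the same: keeping parameters and results together so a run can be found and reproduced later.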

The company launched the alpha version of the MLflow project last year at the Spark + AI Summit.

Multiple code approaches

The basic machine learning life cycle – taking raw data, preparing it, training your model and deploying it – is full of variables and fraught with complications. It can involve hundreds of different open source tools and frameworks, each with dozens of configurable parameters.

Facebook, Google and Uber have all built their own proprietary tools to deal with this complexity.

MLflow was designed to take some of the pain out of machine learning in organizations that don’t have the coding and engineering muscle of the hyperscalers. According to Databricks, it works with every major ML library, algorithm, deployment tool and language.

One of the project’s goals is to improve collaboration between data scientists and the engineers who deploy their creations in production.

In true open source fashion, MLflow users didn’t wait for a stable release to start experimenting: Databricks says the platform has already been deployed at thousands of organizations to manage their machine learning workloads, and the company is offering it as a managed service.

Group effort

Databricks might have started the project, but today, it has more than 100 contributors, including a few from Microsoft.

"People are excited about having an open-source project in this space," Matei Zaharia, co-founder and chief technologist of Databricks, told El Reg last year.

"They're excited about having an ML platform – it's something that resonates with them, and that many wanted to build already – and having one that is a community effort will be much better than what any company could build on its own."

The next major addition to MLflow will be a Model Registry that allows users to manage their ML models’ lifecycle from experimentation to deployment and monitoring.

You can find full release notes on GitHub, along with the project’s code base. ®
