Deep Learning Containers (DLC) has entered beta stage, according to Google's Cloud Platform team.
The service lets users spin up an instant machine learning environment - local or remote - pre-configured for popular frameworks like TensorFlow and PyTorch.
DLC are sold alongside Google’s existing Deep Learning VMs, a set of Debian-based disk images, complete with access to NVIDIA GPUs, for running ML frameworks. Currently there are 14 such VM images, including TensorFlow, PyTorch (an ML library for Python), R 3.5, and Intel MKL (Math Kernel Library) with CUDA, though half of the images on offer are marked Experimental.
At the time of writing there are 20 Deep Learning Container images including TensorFlow, PyTorch and R, with both CPU and GPU options. Others are to follow.
“We are working to reach parity with all Deep Learning VM types,” states the Google blurb.
The current list of Deep Learning Container images on offer
A nice thing about these images is that you can run them locally without paying anything up front. Each image is set up with a Python 3 environment, the selected ML libraries, and a Jupyter server which runs automatically. There are both CPU and GPU options. All you need is a working Docker setup, plus nvidia-docker and a CUDA 10 GPU if you want acceleration via NVIDIA’s CUDA parallel computing platform. You install the gcloud SDK, pull the container image you want and run it. The images are relatively large, with the TensorFlow image nearly 6GB and the PyTorch image around 8GB.
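As a rough illustration of the "pull and run" step, the Python sketch below assembles a docker run command for a chosen image. It is a sketch under assumptions rather than Google's documented procedure: the registry path, the image name tf-cpu.1-13, the Jupyter port 8080 and the /home mount point do not come from the article and should be checked against the current image list. The helper build_run_command is hypothetical and only builds the argument list; it does not call Docker.

```python
import shlex

# Assumption: registry path for the container images; verify with the gcloud SDK.
ASSUMED_REPOSITORY = "gcr.io/deeplearning-platform-release"


def build_run_command(image_name, use_gpu=False, host_port=8080, local_dir=None):
    """Return the argument list for a `docker run` that exposes Jupyter locally."""
    command = ["docker", "run", "-d"]
    if use_gpu:
        # nvidia-docker 2 registers an "nvidia" runtime with Docker.
        command.append("--runtime=nvidia")
    # Assumption: the Jupyter server inside the container listens on port 8080.
    command += ["-p", f"{host_port}:8080"]
    if local_dir is not None:
        # Assumption: /home is a sensible mount point for local notebooks.
        command += ["-v", f"{local_dir}:/home"]
    command.append(f"{ASSUMED_REPOSITORY}/{image_name}")
    return command


if __name__ == "__main__":
    # Hypothetical image name; print the command instead of executing it.
    cpu_command = build_run_command("tf-cpu.1-13", local_dir="/tmp/notebooks")
    print(shlex.join(cpu_command))
```

Passing use_gpu=True adds the NVIDIA runtime flag, which only works on a machine that has nvidia-docker and a suitable GPU. Once the container is up, Jupyter should be reachable in a browser on the chosen host port.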
Jupyter is an open source interactive tool for working with and sharing code, equations, visualisations and text, which has become a standard in the data science community.
Google’s hope is that you will want to use a “beefier machine than what your local machine has to offer”. You can customise your container as needed, for example to include your own Python packages, and upload it to the GCP container registry. You can then deploy it to GCP using one of several options for running containers, including Google Kubernetes Engine (GKE), Cloud Run (a serverless option), or Docker Swarm. You can also deploy to an AI Platform Notebooks instance, a managed Jupyter Notebook service.
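Customising a container typically means writing a short Dockerfile that starts from one of the supplied images and adds your own packages. The Python sketch below generates such a file and a matching registry tag. The helpers render_dockerfile and registry_tag are hypothetical, as are the base image name, project ID and package list in the usage section; only the FROM and RUN instructions and the gcr.io/PROJECT/NAME:VERSION tag layout are standard.

```python
def render_dockerfile(base_image, pip_packages):
    """Return Dockerfile text that layers extra Python packages on a base image."""
    lines = [f"FROM {base_image}"]
    if pip_packages:
        # Sort so the output is deterministic; one RUN adds a single extra layer.
        packages = " ".join(sorted(pip_packages))
        lines.append(f"RUN pip install --no-cache-dir {packages}")
    return "\n".join(lines) + "\n"


def registry_tag(project_id, name, version="latest"):
    """Build a registry tag of the form gcr.io/PROJECT/NAME:VERSION."""
    return f"gcr.io/{project_id}/{name}:{version}"


if __name__ == "__main__":
    # Hypothetical base image, project ID and packages, for illustration only.
    base = "gcr.io/deeplearning-platform-release/pytorch-cpu"
    print(render_dockerfile(base, ["scikit-learn", "pandas"]))
    print(registry_tag("my-project", "custom-dlc", "v1"))
```

With the generated text saved as a Dockerfile, the usual docker build -t followed by docker push with the printed tag would upload the image, assuming Docker has been authorised against the registry.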
There is also, as you would expect, integration with other Google services such as BigQuery, Cloud Dataproc for Apache Hadoop and Apache Spark, and Cloud Dataflow for batch processing and streaming data using Apache Beam. The container-based approach and the fact that you can start by running locally make Deep Learning Containers an easy way into GCP, and Google will soon profit if you are tempted to use its other cloud services. ML services consume lots of processing power, making them an attractive proposition for the various cloud providers.
Amazon Web Services (AWS) offers a mature set of ML services, such as Amazon SageMaker which includes pre-built Jupyter notebooks and a range of models in the AWS Marketplace for Machine Learning.
Similarly, Microsoft has been on this for a while in Azure, with Azure Machine Learning Workspaces guiding users through building Docker images, deploying models and creating pipelines.
The further implication is that if you have more than an occasional requirement it may be substantially cheaper to run on your own kit.
Google will be trusting that its affinity with Kubernetes for container orchestration will help it attract users with large-scale projects.