VMware and Nvidia buddy up to integrate AI containers with vSphere, promising results as 'fast as public cloud'

But no faster, so do you need it?

VMworld VMware says it is integrating Nvidia's GPU Cloud (NGC) into its vSphere and VMware Cloud Foundation platforms to tap into the growing demand for AI chops in the enterprise.

The NGC integration is the second part of what looks like a significant tie-up between Nvidia and VMware. Virtzilla yesterday unveiled Project Monterey, which uses the Nvidia Bluefield-2 DPU (Data Processing Unit) for programmable SmartNICs and storage controllers, offloading the ESXi hypervisor from the server CPU to the SmartNIC.

A further implication is that with ESXi running on a SmartNIC, the VMware management framework can manage bare-metal servers as well as virtual machines. "The decoupling of networking, storage, and security functions from the main server allows these functions to be patched and upgraded independently from the server," the company said.

The AI piece is separate and centred on Nvidia NGC. This is a catalogue of frameworks, containers, Helm charts (which define Kubernetes applications), and models for AI solutions running on Nvidia GPUs. Another piece is Nvidia Clara, an application framework aimed at AI in healthcare, including medical imaging, genomics, and sensor-based solutions for smart hospitals.

NGC components run anywhere there is a suitable Nvidia GPU, though there is also a hardware certification scheme called "NGC-Ready" to give assurances that everything will work as expected. NGC products can also be found in the marketplaces of various cloud providers, including AWS, Google Cloud Platform, Microsoft Azure, Oracle, and Alibaba. Nvidia does not charge for NGC since it makes its money selling the GPUs required to run the software. NGC will also integrate with Tanzu, VMware's application platform.
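For a flavour of what deploying an NGC container looks like in a Kubernetes environment such as Tanzu, a pod spec along these lines requests a GPU via the standard nvidia.com/gpu resource exposed by Nvidia's device plugin. This is an illustrative sketch, not taken from the NGC catalogue: the image tag and pod name are examples, and the cluster is assumed to have GPU nodes with the device plugin installed.

```yaml
# Illustrative pod spec: runs an NGC container image on a GPU node.
# Assumes the Nvidia device plugin is deployed on the cluster; the
# image tag and names here are examples, not a specific NGC release.
apiVersion: v1
kind: Pod
metadata:
  name: ngc-pytorch-demo
spec:
  restartPolicy: Never
  containers:
  - name: pytorch
    image: nvcr.io/nvidia/pytorch:20.09-py3   # NGC registry; tag is illustrative
    command: ["python", "-c", "import torch; print(torch.cuda.is_available())"]
    resources:
      limits:
        nvidia.com/gpu: 1   # request one GPU from the device plugin
```

The point of NGC is that the container ships with the framework, drivers' userspace libraries, and tuned dependencies pre-baked, so the spec only has to name the image and ask for a GPU.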

Both VMware and Nvidia are betting that AI usage will continue to grow. Examples include ad personalisation, image classification, fraud detection, support chatbots, anomaly detection in manufacturing, assisting medical diagnosis, supply chain optimisation, failure prediction, demand forecasting, and increased automation across almost every industry.

Nvidia CEO Jensen Huang and VMware boss Pat Gelsinger at VMworld

If it is already easy to run NGC applications, where does VMware fit in? That is the key question, and the answer is about Virtzilla carving out a new space for itself as enterprises migrate from on-premises vSphere-based systems to containers and/or the public cloud, bringing a more cloud-like experience to on-premises data centres.

Virtualized GPUs can carry as little as 4 per cent overhead versus native GPUs, according to a presentation at VMworld, and bring the same kind of benefits as virtualizing CPUs: flexibility and higher utilisation. Virtual GPUs can be shared between VMs, especially with Nvidia A100 GPUs, which have specific multi-instance support, and virtual GPU workloads can be scheduled so that, for example, GPU-intensive batch jobs run at night and interactive CAD (Computer-Aided Design) runs during the day. While virtualizing Nvidia GPUs on vSphere is nothing new, integration with NGC should simplify the use of these pre-baked models and frameworks within this environment.

That said, there is an element of catch-up here as VMware strives to compete with what is already available in the public cloud in terms of the ability to run and scale AI applications. Nvidia promises, via the partnership, "a platform that delivers AI results fast as the public cloud", suggesting the main benefit is for those who for some reason (perhaps compliance) prefer to keep workloads on-premises. A "multi-year partnership" is promised, and there is an early access programme for the impatient. ®
