Google Cloud flexes as first to host Nvidia RTX PRO 6000 Server VMs

Baby got Blackwell GPUs

Google Cloud on Wednesday celebrated the debut of virtual machines incorporating Nvidia's latest Blackwell GPU technology, claiming to be the first cloud provider to sell this particular offering.

Nirav Mehta, VP of Google Cloud Compute Platform, and Roy Kim, Director of Google Cloud AI Infrastructure, shared word that G4 VMs based on Nvidia RTX PRO 6000 Blackwell Server Edition are coming soon.

"The G4 VM can power a variety of workloads, from cost-efficient inference, to advanced physical AI, robotics simulations, generative AI-enabled content creation, and next-generation game rendering," said Mehta and Kim in a blog post.

Google Cloud's virtualized version incorporates eight Nvidia RTX PRO 6000 GPUs, two AMD Epyc "Turin" CPUs, and Google Titanium offload processors.

Announced at Nvidia GTC this spring, the PCIe-based RTX PRO 6000 Server Edition is the spiritual successor to Nvidia's aging L40 and L40S, and is aimed at a mix of AI inference, model fine-tuning, and data visualization workloads such as digital twins.

Each accelerator is capable of churning out 3,753 teraFLOPS of sparse FP4 compute and is equipped with 96 GB of GDDR7 delivering 1.6 TB/s of memory bandwidth. Put together, the G4 VMs sport 768 GB of GDDR7 memory and 384 vCPUs, with 12 TiB of Titanium local SSD that can be supplemented by up to 512 TiB of Hyperdisk network block storage. That's four times the compute and memory, and six times the memory bandwidth, of Google's earlier G2 VMs.

For AI inference workloads, more memory means you can run models with larger parameter counts, while higher memory bandwidth translates directly into higher token throughput.
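To see why, a rough, illustrative calculation is enough. The figures below use the per-GPU specs quoted above (96 GB of GDDR7, ~1.6 TB/s); the 80 percent usable-memory assumption and the 70B example model are ours, not Google's or Nvidia's, and real deployments must also budget for KV cache, activations, and framework overhead:

```python
# Back-of-envelope sizing for LLM inference on a single RTX PRO 6000.
# Illustrative only: the 0.8 memory-overhead factor and the 70B example
# model are assumptions, not vendor figures.

GPU_MEMORY_GB = 96          # per-GPU GDDR7 capacity
BANDWIDTH_GBPS = 1600       # ~1.6 TB/s memory bandwidth
BYTES_PER_PARAM_FP4 = 0.5   # 4-bit weights = half a byte per parameter

def max_params_billions(memory_gb, bytes_per_param, usable_fraction=0.8):
    """Largest model (billions of params) whose weights fit in memory,
    reserving a slice of capacity for KV cache and activations."""
    usable_bytes = memory_gb * 1e9 * usable_fraction
    return usable_bytes / bytes_per_param / 1e9

def peak_tokens_per_sec(bandwidth_gbps, params_billions, bytes_per_param):
    """Bandwidth-bound decode ceiling: each generated token must stream
    the full weight set from memory at least once."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gbps * 1e9 / model_bytes

print(f"Fits up to ~{max_params_billions(GPU_MEMORY_GB, BYTES_PER_PARAM_FP4):.0f}B params")
print(f"~{peak_tokens_per_sec(BANDWIDTH_GBPS, 70, BYTES_PER_PARAM_FP4):.0f} tok/s ceiling for a 70B FP4 model")
```

By this arithmetic a single card can hold a roughly 150B-parameter FP4 model, and doubling bandwidth doubles the single-stream decode ceiling, which is why the 6x bandwidth jump over G2 matters more than the raw FLOPS for serving.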

Google's A4 and A4X VMs, introduced earlier this year, also rely on Blackwell GPUs, but they are geared toward AI training and large-scale inference and lack the graphics pipelines necessary for visualization or rendering workloads. That makes the G4 VMs a bit of a Swiss Army knife, capable of running a wider variety of GPU-accelerated functions.

Vinay Kola, senior manager of software engineering at Snap, was tapped by Google Cloud to comment. "Our initial tests of the G4 VM show great potential, especially for self-hosted LLM inference use cases," he said in a statement. "We are excited to benchmark the G4 VM for a variety of other ranking workloads in the future."

Independent analysts see the move as both a flex and a signal of Google's cloud priorities.

"There are certainly some bragging rights here," said Crawford Del Prete, President of IDC, in an email to The Register. "That said, this also speaks to the priority Google is placing on addressing the needs of a wide set of customer workloads on GCP, at scale and in the performance envelope required by customers. The G4 VMs with Nvidia RTX PRO 6000 Blackwell Server will attract the interest of customers who want the flexibility to run (AI) CPU, performance storage, and memory intensive workloads in the cloud.

"For some customers, this approach will be attractive as Google is integrating many of its services into its 'AI Hypercomputer,' a system where they can offer a high performance cloud solution to customers requiring performance-oriented workloads. This attracts a customer set willing to pay up for that kind of performance."

Google Cloud customers may have to moderate their excitement a bit, though, since G4 VMs are presently available as a preview, meaning some negotiation with a Google Cloud sales rep may be required to actually use one. Global availability is expected by the end of the year. ®
