Cisco, Nvidia expand collab to push Ethernet into AI clusters
InfiniBand dominates in GPU-boosted servers while Big E gains steam
At Cisco Live in Amsterdam on Tuesday, the enterprise networking goliath announced a series of hardware and software platforms in collaboration with Nvidia tailored to everyone's favorite buzzword these days: AI/ML.
A key focus of the collaboration is making AI systems easier to deploy and manage using standard Ethernet, something we're sure all those who've gone through the trouble of getting their CCNA and/or CCNP certificates will appreciate.
While the GPUs that power AI clusters tend to dominate the conversation, the high-performance, low-latency networks required to support them can be quite complex. It's true that modern GPU nodes benefit heavily from speedy 200Gb/s, 400Gb/s, and soon 800Gb/s networking, but that's only part of the equation, particularly when it comes to training. Because these workloads often have to be distributed across multiple servers, each containing four or eight GPUs, any additional latency can stretch training times.
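To see why the fabric sits on the critical path, consider what multi-node data-parallel training looks like in code. Here's a minimal sketch using PyTorch with the NCCL backend (our illustration; the model, tensor sizes, and launcher are placeholders, not anything Cisco or Nvidia ships): every training step ends with a gradient all-reduce across all of the servers, so every iteration pays the network's latency.

```python
# Minimal multi-node data-parallel training sketch (PyTorch + NCCL assumed;
# the model and tensor sizes are placeholders). Each GPU runs one copy of
# this process, typically launched with torchrun on every node.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda(local_rank)
    ddp_model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    for _ in range(10):
        x = torch.randn(32, 4096, device=local_rank)
        loss = ddp_model(x).square().mean()
        opt.zero_grad()
        # backward() triggers a gradient all-reduce across every node;
        # this collective is what the 400Gb/s fabric carries each step.
        loss.backward()
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Because that collective crosses the inter-node fabric on every single step, a congested or lossy network translates directly into longer training runs.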
Because of this, Nvidia's InfiniBand continues to dominate AI networking deployments. In a recent interview, Dell'Oro Group analyst Sameh Boujelbene estimated that about 90 percent of deployments use Nvidia/Mellanox's InfiniBand, not Ethernet.
That's not to say Ethernet isn't gaining traction. Emerging technologies like smartNICs and AI-optimized switch ASICs with deep packet buffers have helped to curb packet loss, making Ethernet at least behave more like InfiniBand.
For instance, Cisco's Silicon One G200 switch ASIC, which we looked at last summer, boasts a number of features beneficial to AI networks, including advanced congestion management, packet-spraying techniques, and link failover. But it's important to note these features aren't unique to Cisco: Nvidia and Broadcom have both announced similarly capable switches in recent years.
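Packet spraying in particular is easy to grasp with a toy model. The Python sketch below (our illustration, with made-up flow counts; it says nothing about how the G200 actually implements the technique) contrasts classic per-flow ECMP hashing, which can pin several elephant flows to the same uplink, with spraying packets across all uplinks:

```python
# Toy contrast between per-flow ECMP hashing and packet spraying across
# four uplinks. The flows and packet counts are made up for illustration;
# this is not a model of the G200's actual logic.
import random
import zlib
from collections import Counter

LINKS = 4
flows = [("gpu0->gpu8", 1000), ("gpu1->gpu9", 1000), ("gpu2->gpu10", 1000)]

# Per-flow ECMP: a hash pins every packet of a flow to one uplink, so a
# handful of long-lived "elephant" flows can pile onto the same link.
ecmp = Counter()
for name, pkts in flows:
    ecmp[zlib.crc32(name.encode()) % LINKS] += pkts

# Packet spraying: each packet picks an uplink independently, spreading
# load nearly evenly at the cost of possible out-of-order arrival.
spray = Counter()
for _, pkts in flows:
    for _ in range(pkts):
        spray[random.randrange(LINKS)] += 1

print("per-flow ECMP load:", dict(ecmp))
print("sprayed load:      ", dict(spray))
```

The trade-off is that sprayed packets can arrive out of order, which is why switches and NICs that use the technique also have to handle reordering.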
Dell'Oro predicts Ethernet will capture about 20 points of revenue share in AI networks by 2027. One reason is the industry's familiarity with Ethernet: while AI deployments may still require specific tuning, enterprises already know how to deploy and manage Ethernet infrastructure.
This fact alone makes collaborations with networking vendors like Cisco an attractive prospect for Nvidia. While it may cut into sales of Nvidia's own InfiniBand or Spectrum Ethernet switches, the payoff is the ability to put more GPUs into the hands of enterprises that might otherwise have balked at deploying an entirely separate network stack.
Cisco plays the enterprise AI angle
To support these efforts, Cisco and Nvidia have been rolling out reference designs and systems that aim to ensure compatibility and to address knowledge gaps around deploying the networking, storage, and compute infrastructure needed for AI.
These reference designs target platforms enterprises are likely to have already invested in, including kit from Pure Storage, NetApp, and Red Hat. Unsurprisingly, they also serve to push Cisco's GPU-accelerated systems. These include reference designs and automation scripts for applying its FlexPod and FlashStack frameworks to AI inferencing workloads. Inferencing, particularly on small domain-specific models, is expected by many to make up the bulk of enterprise AI deployments, since such models are relatively frugal to run and train.
The FlashStack AI Cisco Validated Design (CVD) is essentially a playbook for deploying Cisco's networking and GPU-accelerated UCS systems alongside Pure Storage's flash storage arrays. The FlexPod AI CVD, meanwhile, appears to follow a similar pattern, but swaps Pure for NetApp's storage platform. Cisco says these will be ready to roll out later this month, with more Nvidia-backed CVDs coming in the future.
Speaking of Cisco's UCS compute platform, the networking giant has also rolled out an edge-focused version of its X-Series blade systems, which can be equipped with Nvidia's latest GPUs.
The X Direct chassis features eight slots that can be populated with a combination of dual- or quad-socket compute blades, or PCIe expansion nodes for GPU compute. Additional X-Fabric modules can also be used to expand the system's GPU capacity.
However, it's worth noting that unlike many of the GPU nodes we've seen from Supermicro, Dell, HPE, and others, which employ Nvidia's most powerful SXM modules, Cisco's UCS X Direct system only appears to support lower-TDP, PCIe-based GPUs.
According to the data sheet, each server can be equipped with up to six compact GPUs, or up to two dual-slot, full-length, full-height GPUs.
This will likely prove limiting for those looking to run massive large language models, which can consume hundreds of gigabytes of GPU memory. However, it's probably more than adequate for smaller inference workloads and things like data preprocessing at the edge.
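A quick back-of-envelope calculation shows why. Model weights alone set a floor on GPU memory, and the Python sketch below (our illustrative math; the model sizes are examples, not Cisco's figures) assumes FP16 weights at two bytes per parameter:

```python
# Back-of-envelope GPU memory floor for LLM weights (illustrative only).
def weight_footprint_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for weights alone at FP16 (2 bytes/param); ignores the
    KV cache, activations, and framework overhead, which all add more."""
    return params_billion * 1e9 * bytes_per_param / 2**30

for size in (7, 13, 70, 175):
    print(f"{size:>3}B params: ~{weight_footprint_gb(size):,.0f} GB at FP16")
```

At FP16, a 70-billion-parameter model needs roughly 130GB for its weights before counting the KV cache, so it would have to be sharded across many cards, while a 7B model fits comfortably on a single mid-range PCIe GPU.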
Cisco is targeting the platform at manufacturing, healthcare, and those running small datacenters. ®