Sponsored

Over the last two decades, enterprises have gotten datacenter management down to a fine art. Standardization and automation mean improved efficiency, both in terms of raw compute and power consumption. Technologies such as virtualization and containerization let users and developers make more efficient use of resources, to the point of enabling self-service deployment.
However, the general-purpose x86 architectures that power modern datacenters are simply not suited to running AI workloads. AI researchers got round this by repurposing GPU technology to accelerate AI operations, and it is this that has fuelled the breakneck innovation in machine learning over the last decade or so.
This presents a problem, though, for enterprises that want to run AI workloads. Typically, the CPU-host-plus-accelerator approach has meant buying a single box that integrates GPUs with x86-based host compute. That can set alarm bells ringing for enterprise infrastructure teams. Although such systems may theoretically be plug and play, they can take up a lot of rack space and impose power and cooling requirements that differ from those of mainstream compute. They can also be inflexible: with the ratio of host compute to GPU accelerators fixed, juggling multiple workloads with different host compute requirements becomes difficult.