AI agents promise big things. How can we support them?

And this is just the beginning

Sponsored feature If you thought that having ChatGPT create recipes based on what's in your fridge was cool, wait a while - what's coming next will make that seem decidedly retro.

That's the hope for AI advocates who are convinced that the next big thing is agentic technology. It's an evolution of AI that enables it to do far more complex, powerful things, and it has the market excited. More than four in five companies told IDC that they see AI agents as the new enterprise apps, and they're reconsidering their software procurement plans around this new technology.

All this is going to take a lot of AI models running concurrently to do well, and they'll all need managing. That's where AI-ready infrastructure from Nutanix aims to help.

A not-so-secret agent

So what is agentic AI? Early large language models (LLMs) focused on carrying out basic tasks that only human beings could formerly do. Transcribing text, suggesting recipes, and formatting spreadsheets are great applications, but LLMs lacked the depth to do lots of these tasks in succession for more complex outcomes. This is where agentic AI - built atop reasoning models - comes in.

Reasoning models go beyond just retrieving and remixing information, instead working through multi-step problems sequentially. They can often apply logic to novel situations, rather than ones they've seen before. This enables them to display what we call agentic behaviour.

Instead of acting like a simple tool, an agentic model will set sub-goals in the pursuit of a more complex goal, reflecting on its outputs along the way to ensure that they're correct and adapting to context in real time. It might also use tools such as software applications or online services to help it achieve its ends.
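The plan-act-reflect cycle described above can be sketched in a few lines of Python. Every function here is a hypothetical stand-in for a call to a reasoning model; real agent frameworks differ in the details, but the loop shape is the same.

```python
# Minimal sketch of an agentic loop: plan sub-goals, act (possibly via a
# tool), then reflect on each output before keeping it. All functions are
# illustrative placeholders for calls to a reasoning model.

def plan(goal):
    # A reasoning model would decompose the goal; we stub two sub-goals.
    return [f"research: {goal}", f"draft answer for: {goal}"]

def act(sub_goal, tools):
    # Use a matching tool if one exists, otherwise answer directly.
    for name, tool in tools.items():
        if name in sub_goal:
            return tool(sub_goal)
    return f"completed '{sub_goal}'"

def reflect(result):
    # A real agent would ask the model to critique its own output;
    # here we simply accept anything non-empty.
    return bool(result)

def run_agent(goal, tools):
    results = []
    for sub_goal in plan(goal):
        result = act(sub_goal, tools)
        if reflect(result):  # only keep outputs that pass review
            results.append(result)
    return results

tools = {"research": lambda task: f"notes for '{task}'"}
print(run_agent("improve aerofoil fuel efficiency", tools))
```

The point is the control flow, not the stubs: the model, rather than the programmer, decides what the sub-goals are and whether an output is good enough to keep.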

Let's say you want to analyse the fluid dynamics of an aerofoil wing and come up with some alternative designs to improve fuel efficiency. If you wanted to control every part of the project, you'd get out a calculator and allocate half a day. An LLM is the equivalent of that calculator. If you wanted someone trusted to do the groundwork for you, you'd ask a PhD student to handle it without worrying about the details. An agentic AI is the equivalent.

An agentic AI will use multiple LLMs for its reflections, says Debo Dutta, Chief AI Officer at Nutanix. "These large language models leverage traditional databases, storage, and some newer components like vector databases," he says.

The power of reasoning LLMs, combined with these underlying infrastructure tools, breathes new life into enterprise automation, he adds. "Now the large language models can do better decision-making and better planning." Such decisions might include evaluating a customer complaint and advising on the best course of action, for example. "They're pretty good at a lot of tasks for which traditional software was hard to write," Dutta observes.

A proliferation of models

It takes considerable resources to build and deploy agentic AI, especially as it becomes more complex.

Each agentic application usually employs multiple models simultaneously, tailored specifically for their respective roles, rather than using a single general-purpose model.

These include general reasoning and inference models for basic decision-making, embedding models (which convert existing data into a format LLMs can work with), and re-ranking models. The latter prioritise and score the relevance of search results within agentic workflows. Agents also usually require a model guard, which prevents models from generating offensive or inappropriate outputs, he explains.
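Those roles slot together into a single request path. The toy Python below shows the shape of that pipeline; every function is an illustrative placeholder for a dedicated model, not any vendor's actual API.

```python
# Sketch of how the model roles fit together: embed, re-rank, reason, guard.
# Each function is a toy stand-in for a separate, specialised model.

def embed(text):
    # Embedding model: turn text into something comparable (toy word counts).
    words = text.lower().split()
    return {w: words.count(w) for w in words}

def similarity(a, b):
    return sum(count * b.get(word, 0) for word, count in a.items())

def rerank(query, documents):
    # Re-ranking model: score and order candidates by relevance to the query.
    q = embed(query)
    return sorted(documents, key=lambda d: similarity(q, embed(d)), reverse=True)

def guard(text):
    # Model guard: block inappropriate output (toy deny-list check).
    return "BLOCKED" if "offensive" in text else text

def reason(query, context):
    # Reasoning model: draft the final answer from the top-ranked context.
    return f"Answer to '{query}' using: {context[0]}"

docs = ["refund policy for damaged goods", "office lunch menu"]
query = "customer wants refund for damaged goods"
print(guard(reason(query, rerank(query, docs))))
```

In production each stage would be a separate deployed model with its own hardware profile, which is precisely why one agentic application ends up running several models at once.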

Dutta differentiates between models - the LLMs that power the AI - and endpoints. The latter are the APIs that applications will access to exploit the model's capabilities.

As these models proliferate, the processes involved in deploying and using them become more complex. That's compounded by the expense of running the models, which are compute-intensive, says Dutta. Cloud service providers charge for these models on a per-token basis, and their undisciplined use can quickly escalate costs.
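The way per-token charges compound is easy to see with back-of-envelope arithmetic. The prices and token counts below are illustrative assumptions, not real cloud rates.

```python
# Rough sketch of how per-token charges compound across an agentic app.
# All figures are assumptions for illustration only.

PRICE_PER_1K_TOKENS = 0.01   # assumed blended input+output rate, USD
TOKENS_PER_CALL = 2_000      # prompt plus completion for one model call
MODELS_PER_REQUEST = 5       # agentic apps often chain several models
REQUESTS_PER_DAY = 10_000

daily_cost = (TOKENS_PER_CALL / 1_000 * PRICE_PER_1K_TOKENS
              * MODELS_PER_REQUEST * REQUESTS_PER_DAY)
print(f"${daily_cost:,.0f} per day")  # $1,000 per day
```

A two-cent model call looks harmless until it is multiplied by the number of models per request and the number of requests per day, which is how undisciplined usage escalates.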

What Nutanix offers

Nutanix focuses on software for the efficient deployment of cloud technologies on customers' premises, in cloud and multi-cloud environments, and in hybrid scenarios. The company offers Nutanix Enterprise AI, a unified platform designed to simplify, secure, and scale the deployment of large language models (LLMs) and agentic workflows across private, public, and hybrid cloud environments.

Nutanix Enterprise AI is the latest step in the company's journey to make its customers' workloads more manageable and portable across the entire infrastructure, from the edge to the core and the cloud.

"Enterprises are really looking for vendors and solutions that can help with an 'easy button'," says Dutta, harking back to Staples' famous marketing campaign from the mid-2000s. Nutanix, which cut its teeth in hyperconverged infrastructure hardware, has been doing that for years since shifting its focus to cloud infrastructure software.

The move to AI and particularly generative AI has upped the ante for companies grappling with what can often be volatile, expensive workloads. Deploying LLMs in the cloud is easy, but you'll pay for the privilege, especially if all your developers start doing it at once. And deploying these compute- and connectivity-hungry assets on your own premises is harder still, Dutta warns. How do you spec the hardware accordingly? How do you handle capacity planning and cost analysis?

"So how do you get that 'easy button' for deploying my large language models and all the other things you need to build AI agents?" he says. This need has sharpened as we've moved from simple chatbots, to RAG-based LLMs talking with private company data, to more complex agentic systems composed of multiple models.

This is where Nutanix Enterprise AI comes in, Dutta explains. It's a single control point to run all of a company's LLMs and agentic endpoints with three objectives: simplicity, full control, and predictable cost.

Nutanix Enterprise AI is now part of the GPT-in-a-Box 2.0 solution, which is the Nutanix full-stack solution for rapid generative AI deployments. The Enterprise AI part offers day-two operations and management capabilities for LLMs after customers have set up their pre-validated generative AI tools and use cases in GPT-in-a-Box 2.0.

Simply does it

The simplicity comes from the product's centralised architecture. It allows administrators to deploy LLMs from NVIDIA inference microservices (NIM) and Hugging Face, with options to upload custom models of their own, even in dark sites (disconnected environments). They can install and control these from a single point, either on their own bare-metal hardware using the Nutanix Kubernetes Platform, or on CNCF-certified Kubernetes environments in the cloud such as those from Google, Amazon, and Microsoft.

Putting admins in control

The full control and the cost management aspects of Nutanix Enterprise AI are linked. After deployment, administrators can use Nutanix Enterprise AI to produce a secured access API token for each developer. Instead of accessing models directly, developers use these API tokens to access endpoints, which are instances of models running on a GPU-enabled infrastructure and exposed via a secured API. Admins can grant developers role-based access control to these endpoints.
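The token-plus-role pattern described above can be sketched in a few lines. The names and data structures here are hypothetical illustrations of the general idea, not the Nutanix API.

```python
# Sketch of token-gated endpoint access: admins issue one token per
# developer and map roles to the endpoints each role may call.
# All names are illustrative, not any vendor's real API.

ENDPOINT_ROLES = {
    "chat-llm": {"app-dev", "admin"},
    "embedding": {"data-eng", "admin"},
}

TOKENS = {
    "tok-alice": "app-dev",   # issued by the admin, one per developer
    "tok-bob": "data-eng",
}

def authorize(token, endpoint):
    # Resolve the token to a role, then check role-based access.
    role = TOKENS.get(token)
    return role is not None and role in ENDPOINT_ROLES.get(endpoint, set())

print(authorize("tok-alice", "chat-llm"))   # True: app-dev may call chat
print(authorize("tok-alice", "embedding"))  # False: wrong role
```

Because every request carries a token, the admin layer can also meter usage per developer, which is what links the control story to the cost story.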

That's a change from traditional, less mature approaches where developers could set up their own models autonomously - and it promises big gains in cost effectiveness.

"On average for any application, you'll see about four to five LLMs," Dutta says. "Now, imagine 100 of us trying to set those up. Enterprise IT has to deal with security, an extra management headache - and rising costs."

Handing admins the reins for these compute-intensive resources helps them to control model usage and manage costs more efficiently. "We've seen customers really appreciate the fact that there is one layer for the enterprise IT to have full control," Dutta explains.

Nutanix's work with NVIDIA

As a vendor-agnostic solution Nutanix works on a range of hardware, but that hasn't stopped it crafting partnerships with specific hardware partners for tighter integration. That naturally includes the 800lb gorilla in the room: NVIDIA.

Nutanix supports NVIDIA across its bare-metal and Kubernetes deployments. Nutanix Enterprise AI ties into NVIDIA NIM for deploying and operating generative AI models. The Nutanix software makes it easier to deploy NIMs on GPUs wherever they're needed, from data centres to public clouds.

The Nutanix software also supports NVIDIA's Dynamo product, which is a distributed inference engine with caching capabilities. "These are amazing Lego blocks. But if 100 people are doing the same thing, it causes sprawl," Dutta says. Managing it via Nutanix Enterprise AI tames it for customers.

Working with NVIDIA enables Nutanix to validate and certify NIMs against its hardware partners' servers and GPUs, among other devices. That ensures that the NIMs are ready for operation, wherever Nutanix's customers decide to run them.

Nutanix has also certified its Enterprise AI software against NVIDIA's own AI Enterprise software stack, including NVIDIA’s Blueprints for common use cases and its full inference engine suite.

What's next for agentic AI?

Dutta says that this is just the beginning for agentic AI, which he envisages evolving at a rapid pace. Reasoning models open up new possibilities as they become more capable, he says.

"That kind of an analytical thinking process when applied to AI agents means that we are not very far away from creating digital minions," he says. He's quite happy with the idea of being a real-world Gru (but without the villainy of course), directing hundreds of cute little agentic characters in his digital workforce.

Individual minions won't be good at everything, warns Dutta: "Creating a minion that's good for everything is very hard and expensive from a computational and energy point of view." Instead, he foresees each agentic minion excelling at a relatively narrow task. Perhaps an appointment-booking agent here, one that's good at summarising ticket histories there. And maybe another that is adept at performing multi-source retrieval and ranking for research.

As these agentic systems - essentially beefed-up, AI-powered microservices - catch on, companies will need the ability to manage the fabric of compute-hungry services that they create. So Dutta sees a bright future for Nutanix as it helps customers to manage these services more efficiently for the developers who use them.

Sponsored by Nutanix
