Dell cosies up to Meta to tame Llama 2 AI beast on-prem

Spitting in the cloud's eye

Dell has teamed up with Facebook parent Meta to try to make it easier for customers to deploy the Llama 2 large language model (LLM) on premises rather than access it via the cloud.

There is a market for enterprise customers wanting to deploy and run Meta's AI model using their own IT infrastructure, says Dell, and the aim is to become the preferred provider of that kit.

This centers on Dell's Validated Design for Generative AI portfolio - pre-tested hardware builds announced this year, jointly engineered with GPU maker Nvidia. Alongside this, Dell is offering deployment and configuration guidance to get customers up and running faster.

As an example, Dell has integrated the Llama 2 models into its system sizing tools to guide customers to the right configuration for what they want to achieve.

Dell's chief AI officer, Jeff Boudreau, said in a canned statement that generative AI models including Llama 2 have the potential to "transform how industries operate and innovate."

"With the Dell and Meta technology collaboration, we're making open source GenAI more accessible to all customers, through detailed implementation guidance paired with the optimal software and hardware infrastructure for deployments of all sizes," he said.

Llama 2 was made available in July as a set of pre-trained and fine-tuned language models in three sizes: seven billion, 13 billion, and 70 billion parameters, each with different hardware requirements.

The model is free to download for research, and some commercial use is permitted. Meta has already worked with Microsoft and Amazon to make it available on the Azure and AWS cloud platforms.

There is some controversy over calling Llama 2 open source when it is not available under a license approved by the Open Source Initiative (OSI), as The Register noted at the time.

Dell's Validated Designs for Generative AI were unveiled in August, combining the company's server kit with Nvidia GPUs, storage, and software such as Nvidia's AI Enterprise suite. The company is offering these alongside professional services to get customers up and running with generative AI – for a price, of course.

The validated designs are aimed at inferencing work, for applications involving natural language generation, such as chatbots and virtual assistants, as well as marketing and content creation, although Dell has since expanded the portfolio to support customization and tuning of models.

According to Dell, the 7 billion parameter version of Llama 2 can run on a single GPU, while the 13 billion parameter version requires two GPUs, and the 70 billion version calls for eight. Dell outlined in a blog post how the 7 billion and 13 billion parameter versions can be deployed on the PowerEdge R760xa system, while the 70 billion parameter version needs something like the PowerEdge XE9680 server because of the requirement for eight GPUs. ®
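
For readers who want the sizing figures at a glance, here is a minimal sketch of the mapping Dell describes. The GPU counts and server names are the ones quoted above; the table layout and the function name suggest_config are our own illustration, and this is not Dell's sizing tool.

```python
# GPU counts and server models are the figures Dell quotes for Llama 2.
# The dictionary layout and function name are hypothetical, for illustration only.
LLAMA2_SIZING = {
    7: {"gpus": 1, "server": "PowerEdge R760xa"},
    13: {"gpus": 2, "server": "PowerEdge R760xa"},
    70: {"gpus": 8, "server": "PowerEdge XE9680"},
}


def suggest_config(params_billion: int) -> dict:
    """Return the quoted GPU count and server for a Llama 2 model size."""
    try:
        return LLAMA2_SIZING[params_billion]
    except KeyError:
        sizes = ", ".join(str(size) for size in sorted(LLAMA2_SIZING))
        raise ValueError(
            f"No quoted figures for {params_billion}B; known sizes: {sizes}"
        ) from None


if __name__ == "__main__":
    for size in sorted(LLAMA2_SIZING):
        config = suggest_config(size)
        print(f"Llama 2 {size}B: {config['gpus']} GPU(s), {config['server']}")
```

Anything outside the three published sizes raises an error rather than guessing, since Dell's figures cover only those models.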
