Despite the hype, generative AI is not a significant chunk of enterprise cloud spend
Not to be a buzzkill, but let's take a deep dive into the disparity
Generative AI currently only makes up a small fraction of cloud computing costs for enterprises and cloud providers, despite all the hype.
The technology has been around for a while now, but didn't take off until OpenAI released its viral text-generating app ChatGPT last November. Suddenly large language models entered mainstream consciousness, helping people complete work and powering all sorts of new commercial applications.
Microsoft rolled out its GPT-4-based Bing AI internet chatbot soon after, and Google and other search engines followed suit. They showed that AI chatbots weren't designed just for question answering, and could be a powerful, versatile tool. Given input instructions, LLMs can perform other tasks such as text summarization, classification, and search, and can help with planning, reviewing reports, or writing documents.
These capabilities ignited the imaginations of companies from Top 500 businesses to small startups, which began to think about how they might use the technology to boost productivity or cut costs. Industries like healthcare, legal, education, and more are shifting their attention to generative AI in a bid to not fall behind competitors.
"ChatGPT has spurred an increase in AI investment across the board because people see so much value in these generative capabilities," Jim Hare, a distinguished VP analyst at Gartner focused on AI, data science, and analytics, told The Register. "ChatGPT and generative AI has become a board level and a C-suite type of conversation. They're trying to figure out 'how can we use this technology in our organizations more broadly?' and 'how do we automate more of our business?'" Hare added.
Generative AI, however, is still nascent. Although companies are keen to market themselves as forward-thinking leaders of the trend, IT spending on the technology tells a different story. The vast majority of AI cloud computing spend goes to predictive analytics, with other areas such as computer vision, recommendation systems, and graph networks following behind. Generative AI does not yet have a significant impact on bills for enterprises or revenues for cloud platforms.
"In the world of generative AI, it's sort of like the Wild West. I think a lot of organizations are still trying to figure it out. So we're not seeing the mega growth in terms of the impact on cloud spend, except for those providers that are building these large language models," Hare said.
Over the next few years, however, that is expected to change, and enterprises will be spending huge amounts on cloud computing to support their generative AI products and services.
Chetan Kapoor, director of product management for EC2 instances at AWS, agreed. He said Amazon has already bagged business from key generative AI players, including Anthropic and Stability AI. Next, it expects to work with, if it is not doing so already, tech companies building specific products, such as Adobe, which is developing machine-learning-powered graphics applications. Amazon is also starting to get interest from companies that are less known for their investment in next-gen technologies.
"There is notable growth and active usage: [generative AI] is a small percentage of overall compute spending but it's growing four to five times faster than our standard business," Kapoor told us.
"We're seeing a sizable increase in interest and usage from customers pretty much across the spectrum," he continued.
"It's coming in different waves. We have been supporting LLM customers building on AWS for a year-and-a-half, two years already. Before ChatGPT happened we had key customers like Anthropic and Stability AI that have already been building LLMs on AWS.
"So that was the first wave where we had some key startups scaling on us. That wave has now shifted to large enterprises like Adobe scaling their development and deployment of LLMs on AWS. And now we're starting to see a shift where we have large enterprises that have not typically been heavy investors in tech start to come up and see the importance in value that LLMs can bring to their business also."
Show me the money
How much are companies spending exactly? It's a tricky question to answer since it depends on lots of different factors, Karl Freund, founder and principal analyst of Cambrian-AI Research, told The Register. Training and inference costs have steadily decreased over time. The software has matured, and developers are increasingly finding new ways to train and run models more efficiently. Hardware makers have also gotten better at optimizing the performance and throughput of crunching numbers.
It ultimately boils down to the sizes of the models companies use and their workloads. Kapoor said there are different classes of customers for cloud providers like AWS. "You have experts in the market that want to build their own foundational model. They mostly want access to high performance, easily scalable, and just a large amount of compute."
"The second category of customers are going to be startups and enterprises that actually don't either have the expertise or don't have the desire to build their own LLMs. What they essentially want to do is take something that is available publicly, fine tune it based on their datasets and use it for their applications to whatever extent possible."
"And then finally, there's going to be a tier of customers that want to upgrade at the application layer. They don't want to fine tune [or build] any models. All they want to do is integrate the core generative AI functionality into their applications," he explained.
To handle the different types of use cases, cloud providers have to support various infrastructure services with different types of networking, storage, and compute capabilities. Different cloud companies offer different configurations of compute instances. The costs of training and running models will also depend on the providers that enterprises choose to go with.
The prices charged to spin up GPU clusters are impacted by demand and supply. Freund pointed out that some cloud platforms like CoreWeave, for example, are currently offering cheaper deals to rent their A100 and H100 GPUs compared to bigger rivals like AWS or GCP.
CoreWeave raised $100 million from investors Magnetar and Nvidia and secured a $2.3 billion debt facility collateralized by the latter's chips. Freund said it is in Nvidia's interest to start diversifying to other cloud providers, especially ones that aren't developing their own custom silicon to compete with GPUs, like Google's TPU, Amazon's Trainium, or Intel's Habana Gaudi accelerators. With more access to Nvidia's GPUs, CoreWeave can undercut rivals and woo their customers.
Ultimately, the real winners of the generative AI craze are the chipmakers and cloud companies. They control the resources needed to build AI, namely the hardware and infrastructure to support training and running models, as well as storing data.
"The way I like to think of it is like it's like the gold rush. The people that made the most money were not the gold miners per se. It was the makers that made the shovels and picks. The ones that made things that made it easier for the gold miners to actually get out there and search for gold. And that's sort of what's happening here in the world of AI. It's the next gold rush," Hare said. ®
Editor's note: Story revised after publication to include the full quote and context to Chetan Kapoor's commentary on AWS customers using LLMs.