Generative AI slashes cloud migration hassles, says McKinsey partner

Bhargs Srivathsan also urges enterprises to ditch the tech Lamborghinis for a more efficient ride

Generative AI, when used correctly, is cutting cloud migration effort by 30 to 50 percent, according to McKinsey's Bhargs Srivathsan, speaking at a conference in Singapore on Wednesday.

"This is only starting to scratch the surface. As the large language model (LLM) matures, this timeline to migrate workloads to public cloud will just keep going down – and hopefully the migration will also be efficient," said Srivathsan.

She said organizations can use an LLM to map what a system's infrastructure looks like, identify its strengths and weaknesses, move the workloads, and then apply AI-based tools to verify the migration actually worked.
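That workflow can be sketched as a simple pipeline. Everything below is illustrative: the `ask_llm` stub stands in for any chat-completion API, and the prompts, function names, and sample inventory are invented for the example, not a real McKinsey tool.

```python
# Illustrative sketch of an LLM-assisted migration workflow:
# assess the estate, then validate the result after workloads move.

def ask_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call (e.g. a chat-completion endpoint)."""
    return f"[LLM response to: {prompt[:50]}...]"

def assess_infrastructure(inventory: dict) -> str:
    """Step 1: have the model summarize strengths and weaknesses of the estate."""
    return ask_llm(f"Summarize the strengths and weaknesses of: {inventory}")

def validate_migration(before: dict, after: dict) -> str:
    """Final step: ask the model whether the migrated estate matches intent."""
    return ask_llm(f"Compare pre-migration state {before} with post-migration state {after}")

# Hypothetical inventory of an on-prem estate awaiting migration
inventory = {"vms": 120, "databases": ["oracle-11g"], "os": "RHEL 6"}
print(assess_infrastructure(inventory))
```

In practice the stub would be replaced by a call to whichever hosted model the organization uses, with the raw infrastructure inventory fed in as context.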

Organizations can also lean on LLMs for tasks such as producing Architecture Review Board guidelines.

The McKinsey partner said that although many enterprises are just starting to think about implementing AI, 40 percent of those that have already invested are upping their investment.

Srivathsan called the relationship between generative AI and cloud "symbiotic."

"We absolutely cannot deny that cloud really is needed to bring generative AI to life. And, similarly, generative AI can really accelerate the migration to public cloud, as well as the unlock from it," she said.

The four biggest use cases Srivathsan sees for generative AI are content generation, customer engagement, creating synthetic data, and coding. The latter role is especially useful when working on legacy code written by coders who are long gone from an organization, or when such code needs translating to a new programming language.

She stressed the need to use public cloud rather than attempting to build in-house models, as enterprises typically lack the necessary access to GPUs. Off-the-shelf models are also cheaper.

Those working in regulated industries, handling large amounts of proprietary data, or worried about IP infringement can put guardrails in place, Srivathsan said.

She also stressed that LLMs will stay in hyperscale environments for the next five to six years until models mature. While many people feel the need to be closer to their compute power, she advised that very few use cases actually need ultra low latency.

Unless you are Tesla running self-driving systems, or perhaps operating in real time on a manufacturing floor, it's just not necessary, she said.

There is also no need for tailored or massive models, she argued.

"Many enterprises think they need a Lamborghini to deliver a pizza, you definitely don't, you probably don't need as complex and as big as a model, you don't need a 65 billion parametric model to generate customer support scripts, for example," said the McKinsey partner.

However, she recommended not skimping on an API gateway between an organization and the outside world, in order to surface "real-time alerts" if developers are accessing non-proprietary models or data they shouldn't be touching. ®
