Teradata introduces LLMs to predictive analytics
But the outcome is far from certain
Teradata is hitting all the major clouds: its VantageCloud Lake technology has arrived on Microsoft Azure following a similar deal with AWS, with a deployment on Google Cloud expected in the first half of next year.
A continuation of the enterprise data warehousing stalwart's passage into the more unstructured world of data lakes, the move increases Teradata's proximity to large language models (LLMs) through Microsoft's $10 billion investment in OpenAI, a decision the vendor was at pains to underscore this week.
Microsoft is arguably the leader among cloud vendors in exploring the possibilities of LLMs for business, following the noteworthy results of OpenAI's GPT-4 in question answering and generating computer code. Now Teradata – which counts HSBC, Unilever and American Airlines among its data warehousing customers – is singing the same tune.
"Generative AI and LLMs are set to fundamentally transform every industry, dramatically amplifying what people can achieve," Hillary Ashton, chief product officer, told us.
Its position is far from unique. In June, cloud-based data warehouse outfit Snowflake announced an expanded partnership with Microsoft, including product integrations with OpenAI and Azure ML.
"Our integrations with Microsoft's generative AI and LLM services will enable joint customers to leverage the latest AI models and frameworks, enhancing the productivity of developers," said Snowflake chief revenue officer Chris Degnan.
Snowflake is a more recent arrival on the data warehousing and analytics scene, but Teradata's roots date from 1979. While the company has been updated for the cloud – it had specialized in appliance servers to crunch numbers – it still brings historical clout to business data. All of which raises the question of whether LLMs, embodied by neural networks and relying on representing words or sentences as vectors, can outperform the analytics techniques Teradata has been introducing to businesses over the last 40 years.
Resisting the temptation to make a prediction, Teradata's Ashton said: "The future has yet to be written on that."
She said that Teradata customers had started to trial LLMs for predictive analytics. For example, one retailer had used them to produce individually tailored special offers for customers using self-service scanning technology while shopping in the store.
The "next-best-offer" problem is well understood in business analytics, and it is not clear whether LLMs would perform better than existing techniques – dubbed propensity analytics – as "the Smart Cart example is probably unique in terms of a few things," Ashton said.
Instead of picking winners, Teradata's Vantage analytics platform can set up champion-challenger comparisons to figure out which technique is best at predicting future outcomes on trial datasets.
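As a rough illustration only – Teradata has not published how Vantage implements this – a champion-challenger comparison boils down to scoring an incumbent model and a contender on the same trial dataset and keeping whichever predicts better. A minimal sketch, with made-up toy models and data:

```python
# Hypothetical champion-challenger harness: both predictors score the same
# holdout data, and the one with higher accuracy wins. The models and data
# here are invented for illustration, not Teradata's actual implementation.
def champion_challenger(champion, challenger, holdout):
    """Compare two predictors on a shared trial dataset."""
    def accuracy(model):
        hits = sum(1 for features, actual in holdout if model(features) == actual)
        return hits / len(holdout)

    scores = {"champion": accuracy(champion), "challenger": accuracy(challenger)}
    winner = max(scores, key=scores.get)
    return winner, scores

# Toy models: predict a purchase from a single numeric feature
champion = lambda x: "buy" if x > 5 else "skip"
challenger = lambda x: "buy" if x > 3 else "skip"

holdout = [(2, "skip"), (4, "buy"), (6, "buy"), (8, "buy"), (1, "skip")]
winner, scores = champion_challenger(champion, challenger, holdout)
print(winner, scores)  # the challenger wins on this toy holdout
```

The same harness works whether the challenger is a traditional propensity model or an LLM-backed predictor – only the scoring data is shared.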
"In some use cases, I'd expect that large language models will absolutely outperform traditional advanced analytic capabilities. In other cases, we'll probably find that the way we traditionally think about predictive analytics gives you the best outcome and remains the way to move forward," she said.
LLMs were merely another "arrow in your quiver," she said.
Donald Farmer, analyst with Tree Hive Strategy, said there was a certain amount of "fuzziness in the marketing claims" around LLMs.
"Everybody wants to jump on the LLM bandwagon – in fact, everybody needs to. But the specific use cases and any advantages are often poorly thought through," he said.
The analytic capabilities of LLMs are much less well understood than the linguistic modeling they were originally designed for, he added. Nonetheless, they might be able to compete with Markov chains, the math used to capture probabilistic sequences.
"You can think of LLMs as really doing sequence analysis," he said. "Predicting the next word in a sentence, or reproducing a stylistic pattern is, in essence, a sequence analysis task. So, LLMs trained over the right data in the right way could be used in scenarios where previously Markov chains were the best option. Next-best-offer, website clickstream analytics, some fraud detection, some advanced market basket analysis (what will be put in the basket next?), some trading predictions, credit scoring and so on.
"To be clear, Markov chains are probably still the best option for these, but the use of LLMs to do this at scale – and perhaps without requiring the deep technical understanding to build a Markov chain analysis – [could be] very tempting to look at."
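To make Farmer's comparison concrete, a first-order Markov chain for next-best-offer is little more than a table of item-to-item transition counts learned from past baskets. The sketch below uses invented basket data and is only meant to show the kind of simple sequence model an LLM would be competing with:

```python
from collections import defaultdict, Counter

def train_markov(sequences):
    """Count item-to-item transitions across historical shopping baskets."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions

def next_best_offer(transitions, current_item):
    """Return the most frequently observed next item, or None if unseen."""
    counts = transitions.get(current_item)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Invented example baskets, for illustration only
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk", "butter"],
    ["bread", "jam"],
]
model = train_markov(baskets)
print(next_best_offer(model, "bread"))  # "milk" follows "bread" most often
```

The appeal Farmer describes is that an LLM could, in principle, learn such transition patterns from raw sequence data without anyone hand-building the chain – at the cost of far more compute.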
While LLMs and generative AI have captured the public imagination and investors' wallets, how they will work in business is yet to be mapped out. Their application in predictive analytics is not yet proven, despite vendor promises that they will "fundamentally transform" industries. ®