What's up with AI lately? Let's start with soaring costs, public anger, regulations...

'Obtaining genuine consent for training data collection is especially challenging,' industry sages say

The Stanford Institute for Human-Centered Artificial Intelligence (HAI) has issued its seventh annual AI Index Report, which describes a thriving industry facing growing costs, regulation, and public concern.

The 502-page report [PDF] comes from academia and industry – the HAI steering committee is co-helmed by Anthropic co-founder Jack Clark and Ray Perrault, a computer scientist in SRI International's Artificial Intelligence Center – and thus doesn't dwell too much on burn-it-with-fire arguments.

To that point, the report defines privacy in a way that gives individuals a right to consent to large language models (LLMs) using their data. Yet it does not propose that AI firms abandon existing models built without permission. It suggests transparency, rather than penance.

"Obtaining genuine and informed consent for training data collection is especially challenging with LLMs, which rely on massive amounts of data," the report says. "In many cases, users are unaware of how their data is being used or the extent of its collection. Therefore, it is important to ensure transparency around data collection practice."

The outcome of several pending lawsuits, like the case against GitHub's Copilot, could mean that transparency is not enough – that AI training data requires explicit permission, and perhaps prohibitive payments.

But assuming AI is here to stay and must be reckoned with in its current form, the report succeeds in highlighting the promise and peril of automated decision making.

"Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI," the report explains.

Some of the report's top findings are not particularly surprising, like "AI beats humans on some tasks, but not all," and "Industry continues to dominate frontier AI research."

On the latter point, the report says that industry produced 51 noteworthy machine learning models, compared to 15 from academia and 21 from industry-academia collaborations.

While closed models (eg, GPT-4, Gemini) outperformed open source models on a set of 10 AI benchmarks, open source models are becoming more common. Of 149 foundation models released in 2023, 65.7 percent were open source, compared to 44.4 percent in 2022 and 33.3 percent in 2021.

Whether that trend continues may be related to another top finding: "Frontier models get way more expensive." That is to say, open source models look unlikely to become more competitive with their closed source rivals if the cost of training a cutting-edge AI model becomes something only the well-funded can contemplate.

"According to AI Index estimates, the median costs of training frontier AI models nearly doubled in the last year," the report says. "The training costs of state-of-the-art models have especially reached unprecedented levels. For example, OpenAI's GPT-4 used an estimated $78 million worth of compute to train, while Google's Gemini Ultra cost $191 million for compute."

There's already some doubt that AI is worth the money. A January study from MIT CSAIL, MIT Sloan, The Productivity Institute, and IBM’s Institute for Business Value found that "it's only economically sensible to replace human labor with AI in about one-fourth of the jobs where vision is a key component of the work." And a recent Wall Street Journal report indicates tech firms haven't necessarily found a way to make AI investments pay off.

Hence all the added fees for services augmented with AI.

When considered alongside other HAI report findings like "In the US, AI regulations sharply increase," AI model training looks likely to become even more capital intensive. In the US last year, the report says, there were 25 AI-related regulations – up from one in 2016 – and these will bring additional costs.

Another finding that may lead to more regulations, and thus compliance costs, is the way people feel about AI. "People across the globe are more cognizant of AI’s potential impact – and more nervous," the report says. It cites an increase in the share of people who think AI will impact their lives in the next three to five years (66 percent, up six percentage points) and in the share of people who are nervous about AI (52 percent, up 13 percentage points).

A further potential source of trouble for AI firms comes from the lack of evaluation standards for LLMs, a situation which allows AI firms to select their own benchmarks for testing. "This practice complicates efforts to systematically compare the risks and limitations of top AI models," the report says.

The HAI report posits that AI enhances worker productivity and accelerates scientific progress, citing DeepMind's GNoME, "which facilitates the process of materials discovery."

While AI automation has been shown to enhance productivity in specific tasks, its usefulness as a source of ideas remains a matter of debate. As we reported recently, there's some skepticism still about the value of AI-aided predictions for viable new materials, for example.

Be that as it may, big bets are being made on AI. Generative AI investments increased eightfold, from $3 billion in 2022 to $25.2 billion in 2023. And the US is presently the top source of AI systems, with 61 notable AI models in 2023, compared to 21 from the European Union, and 15 from China.

"AI faces two interrelated futures," write Clark and Perrault. "First, technology continues to improve and is increasingly used, having major consequences for productivity and employment. It can be put to both good and bad uses. In the second future, the adoption of AI is constrained by the limitations of the technology."

Over the next few years, we should see which of those two futures will dominate. ®