Worry not. China's on the line saying AGI still a long way off

Instead of Turing Test, subject models to this Survival Game to assess intelligence, scientist tells The Reg


In 1950, Alan Turing proposed the Imitation Game, better known as the Turing Test, to identify when a computer's response to questions becomes convincing enough that the interrogator believes the machine could be human.

Generative AI models have passed the Turing Test and now the tech industry is focused on Artificial General Intelligence (AGI), the hypothetical point at which a computer can understand or learn any intellectual task as well as a human.

Presently, AGI is vaguely defined and does not exist, though there are people already trying to prevent its emergence. Among AI boosters, AGI is a bit like quantum computing – a distant goal cited for funding.

Citing intelligence tests devised by Turing and others – though disappointingly not the Voight-Kampff test from Blade Runner – researchers in China have proposed a method called the Survival Game to determine whether AI models qualify as AGI.

Authors Jingtao Zhan, Jiahao Zhao, Jiayu Li, Yiqun Liu, Bo Zhang, Qingyao Ai, Jiaxin Mao, Hongning Wang, Min Zhang, and Shaoping Ma – affiliated with Tsinghua University and Renmin University of China – describe their approach in a preprint paper titled: "Evaluating Intelligence via Trial and Error."

"The main idea behind this paper is to assess whether current AI systems can find solutions through continuous trial and error," Jingtao Zhan, a PhD student in computer science at Tsinghua University and corresponding author, told The Register.

"If an AI system can find a solution within a limited number of attempts, it is considered to 'survive'; otherwise, it 'goes extinct.'"

Models that survive progress to further tests; those that fail are retrained until they pass, which is a substantial undertaking in itself.

The Survival Game covers various knowledge domains. In image classification, for example, the test assesses how many trial-and-error attempts are required before the model comes up with a correct classification. In question answering, models are tested against three well-known datasets: MMLU-Pro, NQ, and TriviaQA. In mathematics, the test measures performance using three math datasets: CMath, GSM8K, and the MATH competition dataset.
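
To make the survive-or-go-extinct rule concrete, here is a minimal Python sketch of counting attempts against a fixed budget. It is not the authors' published code: the function names, the `max_attempts` budget, and the toy list of ranked guesses are all hypothetical, chosen only to illustrate the idea described above.

```python
from typing import Callable, Iterable, Optional


def attempts_until_correct(
    guesses: Iterable[str],
    is_correct: Callable[[str], bool],
    max_attempts: int,
) -> Optional[int]:
    """Return the 1-based attempt on which a correct guess appears,
    or None if the attempt budget runs out first."""
    for attempt, guess in enumerate(guesses, start=1):
        if attempt > max_attempts:
            break  # budget exhausted before a correct guess
        if is_correct(guess):
            return attempt
    return None


def survives(
    guesses: Iterable[str],
    is_correct: Callable[[str], bool],
    max_attempts: int,
) -> bool:
    """A model 'survives' if it finds a solution within the budget."""
    return attempts_until_correct(guesses, is_correct, max_attempts) is not None


# Toy image-classification case: the model's guesses, in the order it tries them.
ranked_labels = ["dog", "wolf", "cat", "fox"]
truth = "cat"

print(attempts_until_correct(ranked_labels, lambda g: g == truth, max_attempts=3))
print(survives(ranked_labels, lambda g: g == truth, max_attempts=2))
```

With a budget of three attempts the toy model finds the right label on its third try and survives; with a budget of two it goes extinct.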

Supporting code has been published to GitHub.

"The Survival Game is essentially a simplified form of natural selection, and we aim to use this approach to test whether AI can adapt and learn through such a mechanism," said Zhan.

"If an AI system passes this test, it means it can autonomously find solutions without human supervision and operate independently. This serves as both my perspective on AGI and a way to evaluate it."

The researchers' results suggest that even if Moore's Law – the projected doubling of chip transistor density every two years – were to continue beyond its arguable demise in 2016, the cost to build a neural network capable of passing the above AGI tests would be exorbitant, and it would take 70 years for hardware to be able to support the anticipated model.

"Projections suggest that achieving the autonomous level for general tasks would require 10^26 parameters," the paper says.

That's a huge number: "Five orders of magnitude higher than the total number of neurons in all of humanity’s brains combined," the authors observe, assuming roughly 10^11 neurons per human brain and a population approaching 10^10 people, for a total of about 10^21 neurons.
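
Spelled out, the comparison is simple exponent arithmetic. The figures are the rounded ones quoted above, not precise counts:

```latex
\[
\underbrace{10^{11}}_{\mathrm{neurons\ per\ brain}}
\times
\underbrace{10^{10}}_{\mathrm{people}}
= 10^{21},
\qquad
\frac{10^{26}\ \mathrm{parameters}}{10^{21}\ \mathrm{neurons}} = 10^{5}.
\]
```

A factor of 10^5 is the "five orders of magnitude" in the authors' quote.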

Setting aside computation costs such as training and inference, just loading a model with that many parameters onto Nvidia H100 GPUs would be an untenable extravagance.

"Since the memory of an H100 GPU is 80GB, we would need 5 × 10^15 GPUs," the paper says. "Based on the cost of H100 GPUs ($30,000) and the market value of Apple Inc ($3.7 trillion) in February 2025, the total value of these GPUs would be equivalent to 4 × 10^7 times the market value of Apple. As we can see, without breakthroughs in hardware and AI technology, it is infeasible to afford scaling for autonomous-level intelligence."

Zhan argues these results indicate AI technology has a long way to go before it can autonomously solve unknown problems, particularly in an open environment where it must adapt through natural selection.
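
For readers who want to check the hardware arithmetic behind that conclusion, the Python sketch below reproduces the paper's quoted GPU count and cost ratio. One input is an assumption rather than something stated in the quote: 4 bytes per parameter (32-bit weights), which is the value that makes 10^26 parameters and 80GB per GPU come out at 5 × 10^15 GPUs. The 80GB is taken as 80 × 10^9 bytes, and the function and constant names are illustrative.

```python
NUM_PARAMS = 10**26                 # parameter count quoted from the paper
BYTES_PER_PARAM = 4                 # ASSUMPTION: 32-bit weights, inferred from the quoted totals
H100_MEMORY_BYTES = 80 * 10**9      # 80GB per H100, as quoted
H100_PRICE_USD = 30_000             # per-GPU price quoted from the paper
APPLE_MARKET_CAP_USD = 3.7e12       # $3.7 trillion, February 2025, as quoted


def gpus_needed(num_params: int, bytes_per_param: int, gpu_memory_bytes: int) -> int:
    """GPUs required just to hold the weights in memory (ceiling division)."""
    total_bytes = num_params * bytes_per_param
    return -(-total_bytes // gpu_memory_bytes)


gpus = gpus_needed(NUM_PARAMS, BYTES_PER_PARAM, H100_MEMORY_BYTES)
total_cost_usd = gpus * H100_PRICE_USD
apple_multiples = total_cost_usd / APPLE_MARKET_CAP_USD

print(f"GPUs needed:      {gpus:.1e}")
print(f"Total GPU cost:   ${total_cost_usd:.1e}")
print(f"Multiples of Apple market cap: {apple_multiples:.1e}")
```

Under these assumptions the GPU count is 5 × 10^15 and the cost works out to roughly 4 × 10^7 times Apple's market value, matching the figures the paper gives.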

"While current AI systems may perform well in certain benchmarks, achieving high accuracy in predefined tasks, they struggle significantly when faced with problems that require continuous trial and error to find solutions," said Zhan.

The study, Zhan observes, shows that when AI models fail, they rarely adapt to come up with a correct solution through iterative attempts.

"In the Survival Game, this means it cannot survive," said Zhan. "Such trial-and-error learning is crucial in real-world applications, particularly in areas like tool use, autonomous agents, and self-driving cars. If AI can truly learn to solve problems through trial and error, it will mark a significant step toward widespread real-world deployment."

Food for thought. Whether you agree with the team's methodology and approach or not, and some of us here are a little skeptical of the study, we welcome people trying to calculate the trajectory of AI technology without the hype or grift. ®
