Study: While text-generating AI can write like humans, it lacks common sense

Humans live in the real world, machines don't, say academics

AI software may be able to generate text that is grammatically correct and very human-like, but when it comes to common sense, it still lags severely behind humans.

A team of computer scientists from the University of Southern California (USC), the University of Washington, and the Allen Institute for Artificial Intelligence, all in the US, devised a new test examining verbal reasoning skills in machine learning systems. Given a list of simple nouns and verbs, the natural language processing models were tasked with stringing together a sentence to describe a common scenario.

For example, the words “dog”, “frisbee”, “throw”, “catch” prompted one model to generate the sentence: “Two dogs are throwing frisbees at each other.” Although the text is coherent, it’s not something that humans would come up with. The idea of canines playing a game of frisbee isn’t too outlandish, but it’s more plausible that it’d be a human throwing an object for a dog to catch.

“In fact, in our paper, the AI models’ generation is also mostly correct grammatically,” Yuchen Lin, a PhD student at USC, told The Register.

“Their problem is low plausibility: AI generations are either very unusual or impossible in everyday life. For example, ‘a trash bin is under or on the table’ are both grammatically correct but ‘under’ is better for common sense.”

The researchers built a dataset made up of 35,141 scenarios described using 77,449 human-written sentences, and have tested eight different language models so far. The best-performing one, KG-BART, developed by academics at the University of Chicago, had an accuracy rate of 32.7 per cent, compared to Google’s T5-Base model at 22 per cent, according to the leaderboard. All the machine learning systems, however, scored lower than humans, who were accurate 63.5 per cent of the time on average.

“For evaluating a model for our proposed task, we use several popular automatic metrics for machine generation: BLEU, METEOR, CIDEr, and SPICE. These metrics are basically programs that can give a score between model generations and human references that we collect from many people,” Lin explained.

“BLEU and METEOR are designed more for tasks like machine translation, which focus on exact word matches. CIDEr and SPICE, by contrast, are designed for storytelling, and thus are more suitable for our task because we are also open to different scenarios.”
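All four metrics boil down to comparing a model's output against human-written references by word or phrase overlap. As a rough illustration of the idea (not the paper's actual evaluation code, and with made-up example sentences), here is a minimal sketch of the clipped n-gram precision that sits at the heart of BLEU:

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-word sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision: what fraction of the hypothesis's
    n-grams also appear in the reference, with repeats capped at the
    reference's count."""
    ref_counts = Counter(ngrams(reference, n))
    hyp_counts = Counter(ngrams(hypothesis, n))
    clipped = sum(min(count, ref_counts[ng]) for ng, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return clipped / total if total else 0.0

# Hypothetical reference and model output for the "dog, frisbee" scenario
ref = "the dog catches the frisbee".split()
hyp = "a dog catches a frisbee".split()

print(modified_precision(ref, hyp, 1))  # → 0.6
print(modified_precision(ref, hyp, 2))  # → 0.25
```

Because these metrics only measure surface overlap with the collected references, a grammatical but implausible sentence can still score poorly if no human ever described the scene that way, which is what makes the human-written dataset central to the benchmark.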

Lin and his colleagues suggest that if AI models don’t have common sense, applications like voice-activated assistants or robots will be prone to mistakes when interacting with humans. Neural networks often fail to develop reasoning skills because they rely on memorizing their training datasets and don’t have a real-world understanding.

“Current machine text-generation models can write an article that may be convincing to many humans, but they’re basically mimicking what they have seen in the training phase,” said Lin.

He hopes that by developing the common sense test, researchers will be able to build better algorithms in the future. “By introducing common sense and other domain-specific knowledge to machines, I believe that one day we can see AI agents such as Samantha in the movie Her that generate natural responses and interact with our lives,” he concluded. ®
