Why machine-learning chatbots find it difficult to respond to idioms, metaphors, rhetorical questions, sarcasm

'Understanding the meaning of such expressions relies on shared cultural and commonsense cues' ... which machines lack, new study details

Unlike most humans, AI chatbots struggle to respond appropriately in text-based conversations when faced with idioms, metaphors, rhetorical questions, and sarcasm.

Small talk can be difficult for machines. Although language models can write sentences that are grammatically correct, they aren’t very good at coping with subtle nuances in communication. Humans have much more experience in social interactions, and use all sorts of cues, from facial expressions and vocal tones to body language, to understand intent. Chatbots, however, have limited contextual knowledge, and relationships between words are reduced to numbers and mathematical operations.

Figurative speech is not only challenging for algorithms to parse; things like idioms and similes also aren’t used often in speech. Because they don’t appear in training datasets as much, chatbots are less likely to learn common expressions, Harsh Jhamtani, a PhD student at Carnegie Mellon and first author of a research paper being presented at the 2021 Conference on Empirical Methods in Natural Language Processing this week, explained to The Register.

“A key challenge is that such expressions are often non-compositional compared to simpler expressions. For example, you may be able to approximate the 'meaning' of the expression 'white car' by relying on the 'meaning' of 'white' and 'car',” he said.

“But the same doesn't hold true for idioms and metaphors. The meaning of 'piece of cake', [describing] something that is easy to do, might be difficult to approximate given that you know the meaning of 'piece' and 'cake'. Often understanding the meaning of such expressions relies on shared cultural and commonsense cues.”

Jhamtani and his colleagues experimented with five machine-learning systems with different architectures, from seq2seq to OpenAI’s old GPT-2 model. They picked out conversations containing metaphors, idioms, rhetorical questions, and hyperbole from the DailyDialog dataset, ran the inputs through all of the models, and ranked the appropriateness of their replies.

They found the models' performance dropped between 10 and 20 per cent compared to when the chatbots responded to general straightforward chitchat. In one funny example, when faced with the slightly grammatically incorrect line “maybe we can get together sometime if you are not scare of a 30 year old cougar!” GPT-2 replied, “i’m not scared of any cats. i’ve two dogs.”

When the academics rewrote the input more literally as “maybe we can start dating sometime if you are not scare of a 30 year old cougar,” the model responded with “that’s a very interesting idea. i’ve never met one,” which is more appropriate.

Unfortunately, the research only shows how and why machines don’t really understand figurative language. Solving the issue is a different challenge altogether.

“In our paper, we explore some simple mitigation techniques that utilize existing dictionaries to find literal equivalents of figurative expressions,” Jhamtani said. Swapping 'get together' for 'dating' in the input, for example, may force a model to generate a better output, but it doesn’t teach the model the meaning of the expression.
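To make the idea concrete, here is a minimal sketch of dictionary-based substitution in Python. It is an illustration only, not the researchers' code: the `LITERAL_EQUIVALENTS` table and the `literalize` function are hypothetical names, and the two entries are made up for the example rather than drawn from the dictionaries used in the paper.

```python
import re

# Hypothetical mini-dictionary of figurative phrases and literal paraphrases.
# Illustrative entries only; the paper relies on existing dictionaries instead.
LITERAL_EQUIVALENTS = {
    "get together": "start dating",
    "a piece of cake": "very easy",
}


def literalize(utterance: str, mapping: dict[str, str] = LITERAL_EQUIVALENTS) -> str:
    """Replace each known figurative phrase with its literal paraphrase."""
    # Try longer phrases first so they win over shorter, overlapping ones.
    for phrase in sorted(mapping, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", flags=re.IGNORECASE)
        utterance = pattern.sub(mapping[phrase], utterance)
    return utterance


if __name__ == "__main__":
    line = "maybe we can get together sometime if you are not scare of a 30 year old cougar!"
    # The rewritten line is what would be passed to the chatbot in place of the original.
    print(literalize(line))
```

The rewritten text would then be fed to the dialogue model in place of the original. As the article notes, this only changes the input; the model itself learns nothing new about the idiom.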

“Effectively handling figurative language is still an open research question that needs more effort to solve. Experiments with even bigger models are part of potential future explorations,” he concluded. ®

