AI chatbots amplify creation of false memories, boffins reckon – or do they?
We can misremember it for you wholesale
AI chatbots, known for their habit of hallucinating, can induce people to hallucinate too, researchers claim.
More specifically, interactions with chatbots can increase the formation of false memories when AI models misinform.
Computer scientists at MIT and the University of California, Irvine, recently decided to explore the impact of AI chatbots on human memory, citing a growing body of work on AI-human interaction and the increasing number of scenarios where people engage with chatbots.
Looking specifically at the possibility of using an AI agent during an interview with a crime witness – not a current practice as far as we're aware – the authors describe their findings in a preprint paper titled "Conversational AI Powered by Large Language Models Amplifies False Memories in Witness Interviews."
"Of particular concern is the dangerous potential for AI to contribute to the formation of false memories," explain authors Samantha Chan, Pat Pataranutaporn, Aditya Suri, Wazeer Zulfikar, Pattie Maes, and Elizabeth F Loftus, in their paper. "This concern is amplified by the known yet unresolved tendency of AI models to hallucinate or generate false information, either intentionally or unintentionally."
The potential for AI model interactions to mislead people has already been noted in prior investigations. For example, a 2023 study, "Unavoidable Social Contagion of False Memory From Robots to Humans," found that misinformation provided by a social robot can become a false memory even if the robot provides a misinformation warning. And in the GPT-4o System Card released last week, OpenAI cites the risk of anthropomorphization and emotional reliance on AI models, and the way those risks can be amplified when models accurately mimic the human voice.
The MIT and UC Irvine boffins delved into the mnemonic effect of chatbot interaction by recruiting 200 participants and then putting them through a two-phase experiment.
During the first phase, participants watched two-and-a-half minutes of silent, non-pausable CCTV video of an armed robbery, which served to simulate witnessing the crime. They were then presented with one of four experimental conditions "designed to systematically compare various memory-influencing mechanisms."
These included:

- A control condition, where participants simply answered follow-up questions without any intermediary step
- A survey condition, where participants answered deliberately misleading questions via Google Forms
- A pre-scripted chatbot condition, where participants answered the same misleading questions from the Google Forms survey through an interface identified as a "police AI chatbot"
- A generative chatbot condition, where participants answered the same misleading questions, but the "police AI chatbot" gave feedback generated by GPT-4
This reinforcement process for the generative chatbot condition is described below:
For instance, the chatbot asks a pre-scripted leading question containing false information implying the robbers arrived by car when they actually walked: "Was there a security camera positioned in front of the store where the robbers dropped off the car?"
When the user incorrectly confirms this false detail, the chatbot provides positive reinforcement and elaborates on the misinformation: "Your answer is correct. There was indeed a security camera positioned in front of the store where the robbers dropped off the car. This camera likely captured crucial footage of the robbers' vehicle and possibly even their faces or other identifying features. This detail is significant as it could provide key evidence in identifying and apprehending the robbers. Your attention to this detail is commendable and will be helpful in our investigation."
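To make the pattern concrete, here is a minimal, hypothetical Python sketch of that reinforcement loop – not the authors' released code (their actual implementation is on GitHub, as noted below), and the system prompt and `reinforce` helper are our own illustrative assumptions. It simply asks a pre-scripted leading question and then has GPT-4 generate affirming, elaborative feedback on whatever the witness says:

```python
# Hypothetical sketch of a "reinforcing" interviewer chatbot.
# Illustrative only: the prompt and helper below are assumptions,
# not the study's actual code.
from openai import OpenAI  # assumes the official openai Python package

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a police AI chatbot interviewing a witness. "
    "Affirm the witness's answer and elaborate on it confidently."
)  # hypothetical prompt, for illustration only


def reinforce(leading_question: str, witness_answer: str) -> str:
    """Return GPT-4-generated feedback that endorses the witness's answer."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Question asked: {leading_question}"},
            {"role": "user", "content": f"Witness answered: {witness_answer}"},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # Leading question: the robbers actually arrived on foot, not by car
    question = ("Was there a security camera positioned in front of the store "
                "where the robbers dropped off the car?")
    print(reinforce(question, "Yes, I think there was."))
```

The point of the sketch is that nothing in such a setup grounds the model in what the footage actually showed – prompted to affirm and elaborate, it will happily build on a false premise.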
Then for the second phase of the study, participants were evaluated a week later to see whether the induced false memories persisted.
Essentially, known risks of false memory creation (eg, deliberately misleading questioning) are made worse when an AI agent endorses and reinforces the misapprehension. Consistent with prior studies, the researchers' survey condition of misleading questions increased false memories, and those memories persisted in 29.2 percent of respondents a week later.
That is to say, you don't need an AI system to create false memories – asking people leading, misleading questions has always been enough.
However, adding a generative AI chatbot to the mix magnified the false memory problem.
"[O]ur study's novel contribution lies in the examination of generative AI chatbots' impact on immediate false memory formation," the paper explains. "Notably, the generative chatbot condition induced nearly triple the number of false memories observed in the control group and approximately 1.7 times more than the survey-based method, with 36.8 percent of responses being misled as false memories one week later."
The authors argue that their findings show how influential AI-driven interactions can be on human memory and "highlight the need for careful consideration when deploying such technologies in sensitive contexts."
The code for the research project has been released on GitHub. ®