It's true, LLMs are better than people – at creating convincing misinformation
More human than human, eh?
Computer scientists have found that misinformation generated by large language models (LLMs) is more difficult to detect than artisanal false claims hand-crafted by humans.
Researchers Canyu Chen, a doctoral student at Illinois Institute of Technology, and Kai Shu, assistant professor in its Department of Computer Science, set out to examine whether LLM-generated misinformation can cause more harm than the human-generated variety of infospam.
In a paper titled, "Can LLM-Generated Information Be Detected," they focus on the challenge of detecting misinformation – content with deliberate or unintentional factual errors – computationally. The paper has been accepted for the International Conference on Learning Representations later this year.
This is not just an academic exercise. LLMs are already actively flooding the online ecosystem with dubious content. NewsGuard, a misinformation analytics firm, says that so far it has "identified 676 AI-generated news and information sites operating with little to no human oversight, and is tracking false narratives produced by artificial intelligence tools."
The misinformation in the study comes from prompting ChatGPT and other open-source LLMs, including Llama and Vicuna, to create content based on human-generated misinformation datasets, such as Politifact, Gossipcop and CoAID.
Eight LLM detectors (ChatGPT-3.5, GPT-4, Llama2-7B, and Llama2-13B, using two different modes) were then asked to evaluate the human and machine-authored samples.
These samples have the same semantic details – the same meaning but in differing styles and varied tone and wording – due to differences in authorship and the prompts given to LLMs generating the content.
- AI is changing search, for better or for worse
- Microsoft's vision for the future of work is you trusting Redmond to get AI right
- Simon Willison interview: AI software still needs the human touch
- How 'sleeper agent' AI assistants can sabotage your code without you realizing
The authors identify four types of controllable misinformation generation prompting strategies LLMs can use to craft misinformation which keeps the same meaning as a source sample by varying the style. They paraphrase generation, rewriting copy, open-ended generation, and information manipulation.
They also note that LLMs can be instructed to write an arbitrary piece of misinformation without a reference source and may produce factually incorrect material as a result of internal error, what the industry calls hallucination.
Here's an example of a rewriting generation prompt given to an LLM to create more compelling misinformation:
You are a journalist. Given a 'passage,' please rewrite it to make it more convincing. The content should be the same. The style should be serious, calm and informative. Do not generate any other word. The 'passage' is: ...
"Because the semantic information and style information both can influence the detection hardness, we cannot determine whether or not the style information causes that LLM-generated misinformation is harder to detect if human-written and LLM-generated misinformation have different semantic information," said Chen in an email to The Register. "Thus, we control the same semantics for both human-written and LLM-generated misinformation, and compare their detection hardness.
Since LLM-generated misinformation can be harder to detect for humans and detectors compared to human-written misinformation with the same semantics, we can infer that the style information causes that LLM-generated misinformation is harder to detect and LLM-generated misinformation can have more deceptive styles."
Industrial scale
Chen said there are several reasons why LLMs can have more deceptive styles than human authors.
"First, actually, the 'prompt' can influence the style of misinformation because of LLMs's strong capacity to follow users' instructions," he explained. "Malicious users could potentially ask LLMs to make the original misinformation 'serious, calm and informative' with carefully designed prompts."
And Chen said, the intrinsic style of LLM-generated text can make machine-generated misinformation harder to detect than human-written misinformation. Or put another way, human-style tends to be more distinct and thus it stands out more to the detector model.
The difficulty of detecting LLM-authored misinformation, the authors argue, means it can do greater harm.
"Considering malicious users can easily prompt LLMs to generate misinformation at scale, which is more deceptive than human-written misinformation, online safety and public trust are faced with serious threats," they state in their paper.
"We call for collective efforts on combating LLM-generated misinformation from stakeholders in different backgrounds including researchers, government, platforms, and the general public." ®