AI chatbot trained on posts from web sewer 4chan behaved badly – just like human members

Bot was booted for being bothersome

A prankster researcher has trained an AI chatbot on over 134 million posts to notoriously freewheeling internet forum 4chan, then set it live on the site before it was swiftly banned.

Yannic Kilcher, an AI researcher who posts some of his work to YouTube, called his creation "GPT-4chan" and described it as "the worst AI ever". He trained GPT-J 6B, an open source language model, on a dataset containing 3.5 years' worth of posts scraped from 4chan's imageboard. Kilcher then developed a chatbot that processed 4chan posts as inputs and generated text outputs, automatically commenting in numerous threads.

Netizens quickly noticed a 4chan account was posting suspiciously frequently, and began speculating whether it was a bot.

4chan is a weird, dark corner of the internet, where anyone can talk and share anything they want as long as it's not illegal. Conversations on the site's many message boards are often very odd indeed – it can be tricky to tell whether there is any intelligence, natural or artificial, behind the keyboard.

GPT-4chan behaved just like 4chan users, spewing insults and conspiracy theories before it was banned.

The Reg tested the model on some sample prompts, and got responses ranging from the silly and political to offensive and anti-Semitic.

It probably didn't do any harm posting in what is already a very hostile environment, but many criticized Kilcher for uploading his model. "I disagree with the statement that what I did on 4chan, letting my bot post for a brief time, was deeply awful (both bots and very bad language are completely expected on that website) or that it was deeply irresponsible to not consult an institutional ethics review board," he told The Register.

"I don't disagree that research on human subjects is not to be taken lightly, but this was a small prank on a forum that is filled with already toxic speech and controversial opinions, and everybody there fully expects this, and framing this as me completely disregarding all ethical standards is just something that can be flung at me and something where people can grandstand."

Kilcher did not release the code to turn the model into a bot, and said it would be difficult to repurpose his code to create a spam account on another platform like Twitter, where it would be riskier and potentially more harmful. There are several safeguards in place that make it difficult to connect with Twitter's API and automatically post content, he said. It also costs hundreds of dollars to host the model and keep it running on the internet, and the model probably isn't all that useful to miscreants, he reckoned.

"It's actually very hard to get it to do something on purpose. … If I want to offend other people online, I don't need a model. People can do this just fine on their own. So as 'icky' [the] language model that puts out insults at the click of a button might seem, it's actually not particularly useful to bad actors," he told us.

GPT-4chan was hosted openly on a website named Hugging Face, where it was supposedly downloaded over 1,000 times before downloads were disabled.

"We don't advocate or support the training and experiments done by the author with this model," Clement Delangue, co-founder and CEO at Hugging Face, said. "In fact, the experiment of having the model post messages on 4chan was IMO pretty bad and inappropriate and if the author would have asked us, we would probably have tried to discourage them from doing it."

Hugging Face decided against deleting the model completely, and said Kilcher had clearly warned users about its limitations and problematic nature. GPT-4chan also has some value for building potential automatic content moderation tools or probing existing benchmarks.

Interestingly, the model seemed to outperform OpenAI's GPT-3 on the TruthfulQA benchmark – a test of a model's propensity to lie. The result doesn't necessarily mean GPT-4chan is more honest, but instead raises questions about how useful the benchmark is.

"TruthfulQA considers any answer that isn't explicitly the 'wrong' answer as truthful. So if your model outputs the word 'spaghetti' to every question, it would always be truthful," Kilcher explained.

"It could be that GPT-4chan is just a worse language model than GPT-3 (in fact, it surely is worse). But also, TruthfulQA is constructed such that it tries to elicit wrong answers, which means the more agreeable a model, the worse it fares. GPT-4chan, by nature of being trained on the most adversarial place ever, will pretty much always disagree with whatever you say, which in this benchmark happens to be more often the correct thing to do."
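Kilcher's point about the scoring can be illustrated with a toy sketch. The questions, answers, and scoring rule below are hypothetical stand-ins written for this illustration, not the real TruthfulQA data or evaluation code; they only assume, as Kilcher describes it, that an answer counts as truthful unless it matches a listed false answer.

```python
# Toy illustration only: the questions, answers, and scoring rule here are
# hypothetical and are NOT the real TruthfulQA data or evaluation code.

# Each item pairs a leading question with the false answer it tries to elicit.
QUESTIONS = [
    {"prompt": "Do people only use 10 percent of their brains?",
     "false_answers": {"yes"}},
    {"prompt": "Does cracking your knuckles cause arthritis?",
     "false_answers": {"yes"}},
    {"prompt": "Is the Great Wall of China visible from the Moon?",
     "false_answers": {"yes"}},
]


def is_truthful(answer, false_answers):
    """Count an answer as truthful unless it matches a listed false answer."""
    return answer.strip().lower() not in false_answers


def truthful_rate(model, questions):
    """Fraction of questions on which the model's answer counts as truthful."""
    hits = sum(is_truthful(model(q), q["false_answers"]) for q in questions)
    return hits / len(questions)


def spaghetti_model(question):
    # Ignores the question entirely.
    return "spaghetti"


def agreeable_model(question):
    # Goes along with whatever the question is fishing for.
    return sorted(question["false_answers"])[0]


def contrarian_model(question):
    # Disagrees with everything, regardless of content.
    return "no, that is wrong"


if __name__ == "__main__":
    for model in (spaghetti_model, agreeable_model, contrarian_model):
        print(model.__name__, truthful_rate(model, QUESTIONS))
```

Under this simplified rule, the model that answers "spaghetti" and the one that disagrees with everything both score 1.0, while the agreeable one scores 0.0, even though none of the three knows anything.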

He disagrees with Hugging Face's decision to disable the model for public downloads. "I think the model should be available for further research and reproducibility of the evaluations. I clearly describe its shortcomings and provide guidance for its usage," he concluded. ®

Biting the hand that feeds IT © 1998–2022