Could OpenAI's 'too dangerous to release' language model be used to mimic you online? Yes, says this chap: I built a bot to prove it
Facebook convos used to train dopey chat doppelganger
A machine-learning software engineer has trained OpenAI’s too-dangerous-to-release language model on personal Facebook messages to show how easy it is to create a bot that can attempt to impersonate you.
Last month, researchers at OpenAI revealed they had built software that could perform a range of natural language tasks, from machine translation to text generation. Some of the technical details were published in a paper, though most of the material was withheld for fear that it could be used maliciously to create spam-spewing bots or churn out tons of fake news. Instead, OpenAI released a smaller and less effective version nicknamed GPT-2-117M.
Since then, crafty techies have tinkered with the model to develop different tools. Some are beneficial, like the detector that tries to determine whether a chunk of text was written by a machine or a human. Others are more malicious, however, like this bot that generates fake conversations, created by Svilen Todorov, a machine-learning engineer who previously worked at predict.io, a data startup based in Berlin, Germany.
Meet the chatbot
Todorov this month published a guide on how to train OpenAI’s GPT-2-117M model on your own Facebook messages. He downloaded about 14MB of his own conversations with people he had kept in touch with over a few years. GPT-2-117M picked up on the idiosyncratic details of how he typed and even some of the emojis he tended to use. The output wasn't particularly coherent, though the training data was fairly limited.
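For the curious, here is a minimal sketch of how chat logs might be flattened into training text for this kind of fine-tuning. It is not Todorov's code. It assumes each message has already been loaded as a dict with hypothetical keys `sender`, `timestamp` and `text`, and that the training file holds one "Name: message" line per message, which is the layout implied by the seed line he uses to start the model.

```python
def to_training_text(messages):
    """Flatten chat records into one 'Sender: text' line per message, oldest first.

    `messages` is a list of dicts with the hypothetical keys
    'sender', 'timestamp' and 'text' (an assumption, not a real export format).
    """
    ordered = sorted(messages, key=lambda m: m["timestamp"])
    lines = []
    for m in ordered:
        # Collapse internal newlines and runs of spaces so one message stays on one line.
        text = " ".join(m["text"].split())
        if text:  # skip messages that are empty after cleaning
            lines.append(f'{m["sender"]}: {text}')
    return "\n".join(lines) + "\n"


def make_prompt(speaker, opening):
    """Build a seed line in the same 'Name: message' layout as the training text."""
    return f"{speaker}: {opening}\n"
```

Keeping the seed line in the same layout as the training text is the point of the exercise: the model continues whatever pattern it was fine-tuned on, so a name and a colon is enough to cue a conversation with that person.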
“I was somewhat impressed: my personal messages don't make up that much data so it was clear the model wouldn't get too amazing,” he told The Register.
“It was still impressive how much person-specific things it picked up, such as talking about sleep problems when generating conversations with people who have them, mentioning plans to go to the bar with people I go out with, and remembering relevant places. On the other hand, it is obvious that it makes a lot of mistakes but with more data and the bigger model, those can be reduced dramatically.”
Here’s a snippet of a fake conversation completely fabricated by the system. Todorov typed in “Anna Gaydukevich: hi” as the first line to get the model started, and it filled in the rest of the conversation. Note, this is a one-way chat – the AI algorithms generate both sides of the chatter.
It doesn’t make much sense on its own, but it captures some of the mundane aspects of messaging someone about your day. Remember, this is from the limited model, though, and not the withheld version. OpenAI refused to release the full GPT-2 model to prevent people from using the tool to “generate deceptive, biased, or abusive language at scale.” And it looks like their concerns are justified, according to Todorov.
“Generating fake conversations can definitely be used maliciously and there are scams out there already that can be improved by using a better algorithm like GPT-2," he explained. "Scammers are always improving their techniques and getting better tools so it is somewhat hard to tell if something like this model makes a big difference or not, but can it be used maliciously – undoubtedly.”
Imagine a scenario where a bot trained on your own conversations could make a convincing fake profile to send things like spam or dodgy links to your friends online, using phrases and in-jokes you're known to use to hoodwink them into believing it's really you. It would be easier to fall for these types of tricks if they came from people you’re more likely to trust, he suggested.
“That is, indeed, very doable and I personally hope that the public gets better informed quickly on the possibilities because I can easily see more and more people falling for scams like this as algorithms become more sophisticated," Todorov said.
"However, it is also important to note that even with the actual state of the art, something like this won't fool everyone - but for example [what if it’s] just 10 per cent of people? [It’s] certainly possible and already a reason for worry.”
A whole host of nasty apps
Todorov explained to El Reg that he could imagine miscreants using something like GPT-2 to talk to elderly people while pretending to be one of their grandchildren in desperate need of cash. Or they could pose as a human online, only to offer fake webcam videos made from pre-recorded footage.
The code would have to be altered to generate the replies as it progressed, from words typed in by the conversation partner, rather than emit a wall of made-up chatter, of course.
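That change can be sketched roughly as follows. This is an illustration under stated assumptions, not Todorov's implementation: `generate` is a hypothetical stand-in for whatever function samples a continuation from the fine-tuned model, and the chat history is assumed to be a list of "Name: message" lines. The trick is to end the prompt with the bot's name and a colon, then keep only the first line the model produces.

```python
from typing import Callable, List


def next_reply(history: List[str], bot_name: str,
               generate: Callable[[str], str]) -> str:
    """Return one reply for `bot_name`, given the conversation so far.

    `generate` is a hypothetical callable that takes a prompt string and
    returns the model's raw continuation (an assumption for this sketch).
    """
    # Rebuild the transcript, then cue the model to speak as the bot.
    prompt = "".join(line + "\n" for line in history) + f"{bot_name}:"
    continuation = generate(prompt)
    # The model tends to carry on and invent the other side too,
    # so cut the output at the first newline.
    first_line = continuation.split("\n", 1)[0]
    return first_line.strip()
```

In use, each message typed by the human would be appended to `history`, `next_reply` would be called, and the returned line would be appended in turn as `f"{bot_name}: {reply}"` before the next round.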
These types of attacks aren’t particularly difficult to launch either. Many tools like the GPT-2-117M model are open source, and all it takes is dedication and some experience with coding.
“I wouldn't be surprised if people are already attempting to use this smaller model in more malicious ways, and if the bigger model was released those people would only be more successful – but again, how much more successful is somewhat hard to tell,” he said.
“It might be that [the full model] wouldn't make much of a difference but there is definitely a chance that releasing it would be disruptive. Also, I definitely agree that it is time to think about it, as models are only getting better and even if it turns out that it would've been fine to release this one, maybe it wouldn't be when it comes to the next one.
“Keeping things unreleased hinders research and I definitely would've loved to play with the full thing. So yes, this has frustrated a lot of people in the AI community but I personally think that slowing the pace of this specific type of research just slightly while the public gets informed might actually be a good thing, too. Or at least, again, it is hard to tell if it is or isn't and leaning on the side of caution seems appropriate."
To us, the text appears to be obviously machine generated and lacking in coherence, though it was produced using a limited dataset and the limited model. The full version, trained on a fuller dataset, may be more human-like, which is perhaps why OpenAI decided to keep a lid on it.
"One of the exciting things about work like this is it demonstrates the broad capabilities of general language models like GPT-2," a spokesperson for OpenAI told The Reg. "We're studying uses like this carefully as we think about our release strategy for larger models." ®