Could OpenAI's 'too dangerous to release' language model be used to mimic you online? Yes, says this chap: I built a bot to prove it

Facebook convos used to train dopey chat doppelganger

A machine-learning software engineer has trained OpenAI’s too-dangerous-to-release language model on personal Facebook messages to show how easy it is to create a bot that can attempt to impersonate you.

Last month, researchers at OpenAI revealed they had built software that could perform a range of natural language tasks, from machine translation to text generation. Some of the technical details were published in a paper, though most of the materials were withheld for fear the software could be used maliciously to create spam-spewing bots or churn out tons of fake news. Instead, OpenAI released a smaller and less effective version nicknamed GPT-2-117M.

Since then, crafty techies have tinkered with the model to develop different tools. Some are beneficial, like the detector that tries to determine whether a chunk of text was written by machines or humans. Others are more malicious, like this bot that generates fake conversations, created by Svilen Todorov, a machine-learning engineer who previously worked at a data startup based in Berlin, Germany.

Meet the chatbot

Todorov this month published a guide on how to train OpenAI’s GPT-2-117M model on your own Facebook messages. He downloaded about 14MB of his own conversations with people he had kept in touch with over a few years. GPT-2-117M picked up on the idiosyncratic details of how he typed and even some of the emojis he tended to use. The output wasn't particularly coherent, though the training data was fairly limited.

“I was somewhat impressed: my personal messages don't make up that much data so it was clear the model wouldn't get too amazing,” he told The Register.

“It was still impressive how much person-specific things it picked up, such as talking about sleep problems when generating conversations with people who have them, mentioning plans to go to the bar with people I go out with, and remembering relevant places. On the other hand, it is obvious that it makes a lot of mistakes but with more data and the bigger model, those can be reduced dramatically.”

Here’s a snippet of a fake conversation completely fabricated by the system. Todorov typed in “Anna Gaydukevich: hi” as the first line to get the model started, and it filled in the rest of the conversation. Note, this is a one-way chat – the AI algorithms generate both sides of the chatter.


A fake conversation generated by GPT-2-117M between people.
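
The seed line quoted above suggests the conversations were laid out as one "Name: message" line per message. As a rough sketch only, and not Todorov's actual code, messages could be flattened into that shape for fine-tuning and a prompt seeded the same way. The function names and the `sender`/`text` keys are hypothetical; a real Facebook export would need its own parsing.

```python
from typing import Dict, List


def to_training_text(messages: List[Dict[str, str]]) -> str:
    """Flatten chat messages into one 'Sender: text' line per message.

    The 'sender' and 'text' keys are assumed for illustration only.
    """
    lines = []
    for msg in messages:
        # Collapse internal whitespace so each message stays on a single line.
        text = " ".join(msg["text"].split())
        if text:  # skip messages that are empty after cleanup
            lines.append(f"{msg['sender']}: {text}")
    return "\n".join(lines) + "\n"


def make_prompt(sender: str, opening: str) -> str:
    """Build a seed line in the same format as the training text."""
    return f"{sender}: {opening}\n"


if __name__ == "__main__":
    sample = [
        {"sender": "Anna", "text": "hi\nthere"},
        {"sender": "Me", "text": "hello"},
    ]
    print(to_training_text(sample), end="")
    print(make_prompt("Anna Gaydukevich", "hi"), end="")
```

Keeping the prompt in the same layout as the training text is what lets a fine-tuned model carry on with further "Name: message" lines for both speakers.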

It doesn’t make much sense on its own, but it captures some of the mundane aspects of messaging someone about your day. Remember, this is from the limited model, though, and not the withheld version. OpenAI refused to release the full GPT-2 model to prevent people from using the tool to “generate deceptive, biased, or abusive language at scale.” And it looks like their concerns are justified, according to Todorov.

“Generating fake conversations can definitely be used maliciously and there are scams out there already that can be improved by using a better algorithm like GPT-2," he explained. "Scammers are always improving their techniques and getting better tools so it is somewhat hard to tell if something like this model makes a big difference or not, but can it be used maliciously – undoubtedly.”

Imagine a scenario where a bot trained on your own conversations could make a convincing fake profile to send things like spam or dodgy links to your friends online, using phrases and in-jokes you're known to use to hoodwink them into believing it's really you. It would be easier to fall for these types of tricks if they came from people you’re more likely to trust, he suggested.

“That is, indeed, very doable and I personally hope that the public gets better informed quickly on the possibilities because I can easily see more and more people falling for scams like this as algorithms become more sophisticated," Todorov said.

“However, it is also important to note that even with the actual state of the art, something like this won't fool everyone - but for example [what if it’s just] 10 per cent of people? [It’s] certainly possible and already a reason for worry.”

A whole host of nasty apps

Todorov explained to El Reg that he could imagine miscreants using something like GPT-2 to talk to the elderly while pretending to be one of their grandchildren in desperate need of cash, or posing as a human online only to offer fake webcam videos using pre-recorded footage.

Of course, the code would have to be altered to generate replies one at a time, in response to words typed in by the conversation partner, rather than emit a wall of made-up chatter.
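
One way to picture that change, as a minimal sketch under stated assumptions rather than anything from Todorov's guide: keep a running transcript, append the partner's line, ask the model to continue after the bot's name, and keep only the first line of what comes back. Here `generate` is a hypothetical stand-in for whatever text-completion call is used, and a stub takes its place so the example runs without a model.

```python
from typing import Callable, List


def next_reply(
    transcript: List[str],
    partner: str,
    partner_text: str,
    bot: str,
    generate: Callable[[str], str],
) -> str:
    """Produce one bot turn in response to the partner's latest message.

    `generate` is an assumed callable that takes a prompt and returns the
    model's continuation; it is not a real library API.
    """
    transcript.append(f"{partner}: {partner_text}")
    # End the prompt with the bot's name so the continuation is its reply.
    prompt = "\n".join(transcript) + f"\n{bot}:"
    continuation = generate(prompt)
    # Left alone, the model would keep inventing both sides, so cut the
    # continuation at the first line break to keep a single turn.
    reply = continuation.lstrip().split("\n", 1)[0].strip()
    transcript.append(f"{bot}: {reply}")
    return reply


if __name__ == "__main__":
    def stub_generate(prompt: str) -> str:
        # Stand-in for a language model: returns a multi-turn continuation.
        return " not much, you?\nAnna: same\nMe: cool"

    history: List[str] = []
    print(next_reply(history, "Anna", "hi", "Me", stub_generate))
    print(history)
```

Cutting at the first line break is the simplest possible stopping rule; it relies on each message occupying exactly one line in the training text.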




These types of attacks aren’t particularly difficult to launch either. Many tools like the GPT-2-117M model are open source, and all it takes is dedication and some experience with coding.

“I wouldn't be surprised if people are already attempting to use this smaller model in more malicious ways, and if the bigger model was released those people would only be more successful – but again, how much more successful is somewhat hard to tell,” he said.

“It might be that [the full model] wouldn't make much of a difference but there is definitely a chance that releasing it would be disruptive. Also, I definitely agree that it is time to think about it, as models are only getting better and even if it turns out that it would've been fine to release this one, maybe it wouldn't be when it comes to the next one.

“Keeping things unreleased hinders research and I definitely would've loved to play with the full thing. So yes, this has frustrated a lot of people in the AI community but I personally think that slowing the pace of this specific type of research just slightly while the public gets informed might actually be a good thing, too. Or at least, again, it is hard to tell if it is or isn't and leaning on the side of caution seems appropriate."

To us, the text is obviously machine generated and lacks coherence, though it was produced with a limited dataset and the limited model. The full version, with a fuller dataset, may be more human-like, which is perhaps why OpenAI decided to keep a lid on it.

"One of the exciting things about work like this is it demonstrates the broad capabilities of general language models like GPT-2," a spokesperson for OpenAI told The Reg. "We're studying uses like this carefully as we think about our release strategy for larger models." ®


Biting the hand that feeds IT © 1998–2022