Roses are red, this is sublime: We fed OpenAI's latest chat bot a classic Reg headline
'Not safe on database server!' it screamed back as its makers fear it could be abused for dodgy purposes
Analysis Most neural networks are like people with savant syndrome: they have extraordinary capabilities in a very narrow range of tasks.
Now a new system built by researchers at OpenAI is more of a polymath, and has learned to perform simple language tasks without human supervision or task-specific training.
Modern neural-network development work can be tedious, with researchers scrambling to improve their benchmark scores for the same specific tasks, whether it's for image recognition accuracy or translating between different languages. Although the models may be state of the art, they are still woefully brittle and only perform well when tested using certain datasets.
For example, a computer-vision algorithm trained to search for tumors in chest X-rays probably won't be able to spot cancerous cells in other parts of the body. The lack of generalization is the bane of AI, with many boffins frustrated with the current progress in deep learning.
Doing more from one dataset
However, this latest research from California-based OpenAI, revealed today with partial source code, shows that systems can perform multiple tasks after being trained on only a single, albeit large and diverse, dataset.
"We've trained an unsupervised language model that can generate coherent paragraphs and perform rudimentary reading comprehension, machine translation, question answering, and summarization – all without task-specific training: https://t.co/sY30aQM7hU pic.twitter.com/360bGgoea3" – OpenAI (@OpenAI), February 14, 2019
OpenAI says it is withholding portions of the software to prevent it being abused to automatically churn out the equivalent of deep fake videos for the written word: false or misleading articles, spam and abusive messages, passages of text seemingly crafted by a particular author, and so on, that at first glance look legit.
It's believed, or at least feared, the software could be improved so that you could feed in, say, sentences from Twain or Golding, and have it spit out familiar prose from those long-dead scribes, penning words they never actually wrote.
Now extend that to fake auto-generated news reports, spam and phishing emails, online harassment, and other content crafted from people's words and articles, that miscreants can use to spread misinformation, confusion, and misery.
Whether or not this AI-produced fantasy text will fool anyone is up for debate; below is an example of the bot making up stuff from a writing prompt:
Worries over this kind of dual-use of AI – the fact the tech can be used for good and bad, and how to limit the bad – are at the top of OpenAI's mind right now.
And withholding technical details on dual-use neural networks reminds us of physicists in the 1940s who held off publishing too much information about nuclear fission in case it gave away vital clues on how to build a viable atomic weapon.
What's inside this tech, then?
OpenAI's latest model, known as GPT-2, can carry out machine translation, reading comprehension, text summarization, question answering, and text generation after being trained on a single dataset: specifically, eight million newspaper articles and webpages scraped from Reddit links, spanning various areas of knowledge.
In other words, GPT-2 can perform so many different tasks from one training set because it learned from a gigantic lump of text from diverse sources, OpenAI's Alec Radford, a co-author of the research, explained to The Register.
Crucially, it also relies on a technique known as self-attention. GPT-2 is fed tons of sentences, and begins to learn the relationship between words, and over time patterns for grammar and syntax start to emerge.
So, when we asked the neural network a simple question like, "What is a dog?" GPT-2 answered: "According to the ASPCA and other animal welfare groups, it's a domestic dog: a large mammal generally between 10 to 28 pounds. Dogs are classified into groups, such as hunting or protection dogs. Dogs can be either female (males) or male (females). And they can also be purebred."
It has learned that the word "dog" is closely associated with "animal" and throws in a few facts, such as that dogs are mammals and can be bred for hunting or guarding. The example also shows that it's far from perfect, as it adds strange, confusing brackets when talking about female and male dogs.
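To make the self-attention idea mentioned above a little more concrete, here is a minimal sketch in plain Python. It is not OpenAI's code: it assumes a single attention head, uses the input vectors directly as queries, keys and values (a real model would first pass them through learned projections), and the two-dimensional "embeddings" for the three words are invented purely for illustration. The names `self_attention`, `softmax` and `dot` are our own.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors, causal=True):
    """Single-head scaled dot-product self-attention over a list of vectors.

    Simplification (assumption): queries, keys and values are the input
    vectors themselves, with no learned projection matrices.
    Returns (outputs, weights).
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for i, query in enumerate(vectors):
        # With causal masking, position i may only look at positions 0..i.
        visible = vectors[: i + 1] if causal else vectors
        scores = [dot(query, key) / math.sqrt(dim) for key in visible]
        weights = softmax(scores)
        # Each output is a weighted average of the visible vectors.
        mixed = [sum(w * v[d] for w, v in zip(weights, visible))
                 for d in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-d embeddings, invented for illustration only.
tokens = ["dog", "animal", "yacht"]
embeddings = [[1.0, 0.2], [0.9, 0.3], [-0.5, 1.0]]

outputs, weights = self_attention(embeddings, causal=False)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

With these made-up vectors, the row for "dog" puts more weight on "animal" than on "yacht", which is the kind of word-to-word association described above. Setting `causal=True` restricts each word to attending only to itself and the words before it, which is how a left-to-right text generator would typically be set up.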
When it's asked more open-ended questions, it tends to wander off topic. We asked the model what it thought of Mark Zuckerberg, and it replied:
Has he got a future? He looks like an idiot. There are other things but the fact he can afford to have a huge yacht is going to worry me a lot. Mr Johnson said he was "not sure what he knows" about money but said he was not a "money person". He added: "I suppose he does live in a bubble."
Er, who's Mr Johnson, here?
We even fed it one of our own headlines. When presented with “Flak for Slack chaps in yak app hack flap: User database whacked,” it desperately wanted to complete the sentence with related garbled nonsense, adding:
"and not safe on database server! * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *."
Garbage in, garbage out, we suppose.
Some of the information it comes up with is just plain wrong. Jack Clark, policy director at OpenAI, told us the system could potentially be used to "generate misleading news articles or impersonate others online," which is why the organization is not releasing the full blueprints. The team feels the output can be made so coherent that it could be used convincingly for nasty purposes, from spam and phishing messages to impersonation.
For example, OpenAI's CTO and cofounder Greg Brockman demonstrated the model is more than happy to trash the idea of recycling with bogus claims when prompted...
Having bots emit more bile and nonsense like this on the internet is the last thing our inboxes and the web need right now, whether or not you think the output is believable.
Although it can perform different tasks, GPT-2 is not an expert in any of them, and isn't nearly as good as other models that have been trained to do just one thing and one thing well. Its train of thought also tends to wander off incoherently after a few paragraphs.
A human reading it should, after a short while, realize something's up, that it is machine generated or written by someone with an infantile grasp of the language:
"Here's a short story i generated using OpenAI's GPT-2 tool (prompt in bold) pic.twitter.com/DGIVwGuAUV" – will knight (@willknight), February 14, 2019
GPT-2 doesn't answer questions as well as other systems that rely on algorithms to extract and retrieve information. Its translation capabilities are narrow and restricted due to the lack of a sufficient vocabulary in other languages.
What this is really useful for remains to be seen. Stephen Merity, an expert in language modelling who was not involved with the project, said it was interesting nonetheless. The results show that "given enough data and enough time, a language model will understand tasks that are implicit in the dataset – things it was never explicitly told to train for," he told The Register.
"Just by learning the structure of language and how terms are related to each other, you're learning a number of complex underlying tasks."
For what it's worth, the team burned through 256 Google TPU3 cores to develop the network.
The software is fascinating from an academic point of view. And it's still in its early days. However, before some people get too carried away, there's nothing that novel about GPT-2; the architecture is mostly similar to OpenAI's original GPT model, and the language modelling techniques are well known, too.
Yet it's a good steer for researchers cramming large and diverse datasets into their neural networks: bigger always seems to be better in AI. ®