A picture tells 1,000 words. Here's about 750 on Facebook using pics to school AI translators

Not a great way, but an interesting way, to teach bots

Computers are getting pretty good at translating the world's languages. However, as they say, onwards and upwards. Eggheads are now trying to teach machines to do the job in a more human-like way.

“While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans,” Facebook's AI research team and academics at New York University set out in a paper that appeared on arXiv this month.

Thus, rather than explicitly training neural networks on pairs of languages, the team taught bots new lingo by making them play a communication game.

Here's how it worked for two bots: an English one learning Japanese, and a Japanese one learning English. The English computer player – or speaker – is given an image, such as a photo of a galaxy, and tries to describe the picture to the second player, the Japanese bot.

The second bot – or listener – is given two pictures: a target image, and what's called a distractor image, which is a picture of something else. In this example, the target image shown to the listener is the photo of the galaxy, and the distractor image is a picture of a plant. Based on the English speaker's description, the second bot has to guess which of the two pictures it has been shown – the galaxy or the plant – is the one described by the speaker. The listener isn't told which is the target picture; it has to work it out for itself.

The speaker's goal is to send a message that is both an accurate description of the target image, and helps the listener identify the correct image.

As both players take turns to be the speaker and listener, they’re trained to map the right words to the right images in two different languages. It works similarly to conventional machine translation, where neural networks learn to map corresponding words in different languages to translate text.
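
The game described above can be sketched as a toy simulation. This is a minimal sketch under loud assumptions, not the researchers' method: the paper's bots are neural networks looking at real photos, whereas here image IDs stand in for pictures, each bot already knows a one-word label for every image in its own language, and a simple score table replaces the network. The names `Agent`, `play_round`, and `train`, the three images, and the romanised Japanese words are all made up for illustration.

```python
import random

# Hypothetical toy data: image IDs stand in for real photos, and each agent
# already knows the word for every image in its own language.
IMAGES = ["img_galaxy", "img_plant", "img_dog"]
ENGLISH = {"img_galaxy": "galaxy", "img_plant": "plant", "img_dog": "dog"}
JAPANESE = {"img_galaxy": "ginga", "img_plant": "shokubutsu", "img_dog": "inu"}


class Agent:
    def __init__(self, native_lexicon):
        self.native = native_lexicon  # image -> word in the agent's own language
        self.scores = {}              # (foreign word, image) -> learned score

    def speak(self, target):
        """Describe the target image with a single native-language word."""
        return self.native[target]

    def listen(self, word, candidates, rng):
        """Pick the candidate image that best matches the foreign word."""
        best = max(self.scores.get((word, img), 0) for img in candidates)
        tied = [img for img in candidates
                if self.scores.get((word, img), 0) == best]
        return rng.choice(tied)  # guess at random when nothing is known yet

    def learn(self, word, guess, correct):
        """Reinforce a correct guess, penalise a wrong one."""
        delta = 1 if correct else -1
        self.scores[(word, guess)] = self.scores.get((word, guess), 0) + delta

    def translate(self, word):
        """Translate a foreign word by pivoting through the best-matching image."""
        image = max(IMAGES, key=lambda img: self.scores.get((word, img), 0))
        return self.native[image]


def play_round(speaker, listener, target, rng):
    """One round: the speaker describes the target, the listener guesses."""
    distractor = rng.choice([img for img in IMAGES if img != target])
    candidates = [target, distractor]
    rng.shuffle(candidates)  # the listener is not told which one is the target
    word = speaker.speak(target)
    guess = listener.listen(word, candidates, rng)
    listener.learn(word, guess, guess == target)
    return guess == target


def train(epochs=20, seed=0):
    """Let the two agents take turns as speaker and listener."""
    rng = random.Random(seed)
    english, japanese = Agent(ENGLISH), Agent(JAPANESE)
    for _ in range(epochs):
        for target in IMAGES:
            play_round(english, japanese, target, rng)  # English bot speaks
            play_round(japanese, english, target, rng)  # roles swap
    return english, japanese
```

After `english, japanese = train()`, calling `english.translate("ginga")` returns `"galaxy"` and `japanese.translate("plant")` returns `"shokubutsu"`. Neither bot ever saw an English-Japanese word pair; the only thing linking the two vocabularies is the shared set of images.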

“It is natural to use vision as an intermediary: when communicating with someone who does not speak our language, we often directly refer to our surroundings,” the team's paper stated.

It’s a relatively simple experiment, starting with two images and single words across 15 different language pairs, so it’s nowhere near as good as Google Translate or Facebook’s translation system.

When the game was made more difficult, using complete sentences rather than single words in English and German, the system struggled, the researchers admitted. But performance was slightly better with three players instead of two.

In the three-player setup, each player has to communicate with two other bots speaking two other languages. The researchers noticed that the quality of the translations between pairs of languages improved.

Douwe Kiela, a researcher at Facebook, told The Register that "this is probably because of what are called ensemble effects in machine learning: as more agents interact with each other, they learn from more diverse data, which allows them to learn faster and as it turns out, to become better at translation."


Experimenting with scenarios where multiple agents are forced to talk to get a task done is quite popular. OpenAI and Baidu both carried out similar studies to get bots to invent their own language about objects in their environment.

The results from Facebook's latest multi-agent tests show this approach of using image description passing is poor for building a translation engine, but it's an interesting technique nevertheless.

Kiela explained the study was focused on "low-resource translation."

"[It's] an interesting AI problem: we are getting pretty good at translation when there is a lot of parallel data available," he said. "Parallel means having an original sentence and a corresponding translation, but this kind of data isn’t available for a lot of language pairs.

"Low resource machine translation is still very much an open problem. Our method shows that parallel data is not strictly necessary, as long as there is an intermediate common ground, in this case in the form of images."

"It can potentially be used to improve existing translation systems, especially for low resource languages, and it can lead to new translation methods. A problem with the current method is that there are no images for abstract sentences. For example, 'Democracy is a political system' does not have corresponding images. We plan to work on that in future work." ®

Biting the hand that feeds IT © 1998–2022