Facebook’s language translation is now finally powered by several large neural networks.
The social media giant announced on Thursday it has switched to using neural networks for translating people's posts and status updates. It’s not the first outfit to make this kind of jump – Google and Microsoft upgraded their translation services to neural networks back in November, for example.
Crucially, though, we note Facebook is using a mix of convolutional neural networks and recurrent neural networks, whereas it appears Google and Microsoft are only using recurrent neural networks in production.
As the largest social media network – with roughly two billion users – Mark Zuckerberg's empire has to deal with 4.5 billion translations each day. Facebook’s old system employed phrase-based translation, where blocks of words in a sentence were converted from one language to another. It’s a clunky method that cannot translate well between languages that have different syntaxes and structures.
The new system mainly uses recurrent neural networks (RNNs) – a common approach in data science when dealing with natural language. First the words are transformed into vectors, a process known as encoding. The RNN reads the sentence word by word, building up a representation of the whole sentence from the encoded vectors, and that representation is then matched up to the relevant words in the target language – a step known as decoding.
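For the curious, the word-by-word encoding step can be sketched in a few lines of plain Python. This is a toy illustration under our own assumptions, not Facebook's code: the class name ToyRNNEncoder, the vector sizes, and the random untrained weights are all hypothetical, and a production system would learn its weights from data and pair the encoder with a decoder.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Build a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyRNNEncoder:
    """Hypothetical minimal RNN encoder: one vector per word, one hidden state."""

    def __init__(self, vocab, embed_dim=4, hidden_dim=3, seed=0):
        rng = random.Random(seed)
        self.index = {word: i for i, word in enumerate(vocab)}
        self.embeddings = make_matrix(len(vocab), embed_dim, rng)
        self.w_in = make_matrix(hidden_dim, embed_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)
        self.hidden_dim = hidden_dim

    def encode(self, words):
        """Read the sentence in order, folding each word into the hidden state."""
        hidden = [0.0] * self.hidden_dim
        for word in words:
            x = self.embeddings[self.index[word]]   # word -> vector
            from_input = matvec(self.w_in, x)
            from_past = matvec(self.w_rec, hidden)  # depends on earlier words
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_past)]
        return hidden


encoder = ToyRNNEncoder(["hello", "world"])
sentence_vector = encoder.encode(["hello", "world"])
print(sentence_vector)
```

Because each hidden state depends on the one before it, the words have to be processed in order, and swapping two words gives a different sentence vector.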
An example translation from Turkish to English, comparing Facebook’s old phrase-based technology to neural machine translation
Researchers from Facebook’s Applied Machine Learning group took about a year to build the neural machine translation system, using Caffe2 and software written in Python and C++. Support for RNNs was added to Caffe2 as part of the translation system project, and the results of that effort will be released, we're told. Training such large neural networks, each with many layers, had to be done in batches.
The translation process has to be fast for it to be effective – or users will get impatient and shun it. Beam search is used to narrow down the possible translation options: rather than exploring every possible output, the system keeps only a handful of the most likely candidate translations at each step.
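Beam search itself is a standard technique, and a generic version fits in a short Python sketch. Everything here is assumed for illustration: the TOY_MODEL probability table, the token names, and the beam_width parameter are hypothetical stand-ins for a real decoder's next-word probabilities, and this is not a description of Facebook's implementation.

```python
import math

# Hypothetical next-word probabilities, keyed on the previous word only.
TOY_MODEL = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "the": {"cat": 0.9, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def beam_search(model, start="<s>", end="</s>", beam_width=2, max_len=5):
    """Return finished hypotheses as (tokens, log_prob), best first."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        # Extend every surviving hypothesis by every possible next word.
        candidates = []
        for tokens, score in beams:
            for word, prob in model[tokens[-1]].items():
                candidates.append((tokens + [word], score + math.log(prob)))
        # Keep only the beam_width highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by max_len, if any
    return sorted(finished, key=lambda c: c[1], reverse=True)


greedy = beam_search(TOY_MODEL, beam_width=1)[0][0]
wider = beam_search(TOY_MODEL, beam_width=2)[0][0]
print(" ".join(greedy))
print(" ".join(wider))
```

With a beam of one, the search commits to the likeliest first word ("a") and ends up with a sentence of probability 0.30. With a beam of two it keeps "the" alive long enough to find the better sentence at 0.36, which is the trade-off between speed and quality that the beam width controls.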
Meanwhile, Facebook can translate English-to-French and English-to-German faster than other conversions because for those pairings, the website uses convolutional neural networks (CNNs) as opposed to its bank of RNNs. These CNNs are faster than RNNs because all the words in the text can be fed as input to the system simultaneously, rather than one at a time in sequence. Now researchers are working on expanding the CNNs to other languages.
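To see why the convolutional approach lends itself to processing every word at once, consider this rough sketch of a one-dimensional convolution over word vectors. The function names, the three-wide kernel, and the one-number "word vectors" are our own assumptions for illustration, and real networks stack many such layers with learned weights.

```python
def conv_position(vectors, kernel, i):
    """Output for position i: a weighted sum over a fixed window of neighbours."""
    half = len(kernel) // 2
    dim = len(vectors[0])
    out = [0.0] * dim
    for j, weight in enumerate(kernel):
        src = i + j - half
        if 0 <= src < len(vectors):  # positions off the ends count as zeros
            for d in range(dim):
                out[d] += weight * vectors[src][d]
    return out


def conv1d(vectors, kernel):
    """Each position depends only on the input, so all could run in parallel."""
    return [conv_position(vectors, kernel, i) for i in range(len(vectors))]


word_vectors = [[1.0], [2.0], [3.0]]  # hypothetical one-dimensional word vectors
kernel = [0.5, 1.0, 0.5]              # hypothetical window weights

in_order = conv1d(word_vectors, kernel)

# Computing the positions back to front gives the same answer, because no
# output waits on another output (unlike an RNN's hidden state).
back_to_front = [
    conv_position(word_vectors, kernel, i)
    for i in reversed(range(len(word_vectors)))
]
back_to_front.reverse()

print(in_order)
```

Since no output position needs the result of any other, the work can be spread across parallel hardware, whereas an RNN has to finish one word before starting the next.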
Finally, we understand that while, say, Google has a multilingual neural network that can handle English-to-French, English-to-German, English-to-Spanish, and so on, Facebook uses individual networks for each language pairing. ®