
Microsoft to upgrade language translator with new class of AI model

Why use 20 systems when one will do

Microsoft is replacing at least some of its natural-language processing systems with a more efficient class of AI model.

These transformer-based architectures are named "Z-code Mixture of Experts." Only parts of such a model are activated for any given input, because different parts of the model learn different tasks. Traditional neural networks, by contrast, run the entire model for every computation. As neural networks continue to grow, the Z-code approach should keep them from becoming too power-hungry and expensive to run.
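To make the "only parts are activated" idea concrete, here is a minimal sketch of sparse mixture-of-experts routing in plain Python. It is not Microsoft's Z-code implementation: the class name `SparseMoE`, the toy "experts" that merely scale their input, the random gate weights, and the choice of top-2 routing are all assumptions made for illustration. The point is only that a small gating function scores every expert, and just the best-scoring few actually run.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: stands in for a feed-forward sub-network."""
    return lambda x: [scale * v for v in x]


class SparseMoE:
    """Illustrative top-k mixture-of-experts layer (an assumption, not Z-code)."""

    def __init__(self, num_experts, dim, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.experts = [make_expert(i + 1) for i in range(num_experts)]
        # One row of gate weights per expert; random here, learned in practice.
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_experts)]
        self.calls = [0] * num_experts  # how often each expert actually ran

    def forward(self, x):
        # Score every expert cheaply with a dot product.
        scores = [sum(w * v for w, v in zip(row, x)) for row in self.gate]
        # Keep only the top_k highest-scoring experts.
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:self.top_k]
        weights = softmax([scores[i] for i in top])
        out = [0.0] * len(x)
        for weight, idx in zip(weights, top):
            self.calls[idx] += 1  # only selected experts do any work
            expert_out = self.experts[idx](x)
            out = [o + weight * v for o, v in zip(out, expert_out)]
        return out, top


moe = SparseMoE(num_experts=8, dim=4, top_k=2)
output, used = moe.forward([0.5, -1.0, 2.0, 0.1])
print("experts used:", used)
print("expert evaluations:", sum(moe.calls), "of", len(moe.experts))
```

In this toy setup, only two of the eight experts are evaluated per input, so compute per input stays roughly flat even as more experts (and therefore more parameters) are added.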

Microsoft said it has deployed these types of models for its text summarization, custom text classification, and key-phrase extraction services that are available from Azure.

Now, it's turning its attention to Translator, its online machine translation service. Translator previously required 20 models to translate between ten human languages. The same job can be performed with just a single Z-code system, running in Microsoft's cloud, we're told.

In a series of tests commissioned by Microsoft, human evaluators compared the quality of translations produced by the old and new Translator models. The data showed the Z-code version was on average four percent better. It improved English to French translations by 3.2 percent, English to Turkish by 5.8 percent, Japanese to English by 7.6 percent, English to Arabic by 9.3 percent, and English to Slovenian by 15 percent, we're told.

Instead of explicitly training on pairs of languages, the Z-code Translator model learned how to translate between multiple languages using transfer learning, Xuedong Huang, Microsoft technical fellow and Azure AI chief technology officer, explained.

"With Z-code we are really making amazing progress because we are leveraging both transfer learning and multitask learning from monolingual and multilingual data to create a state-of-the-art language model that we believe has the best combination of quality, performance and efficiency that we can provide to our customers."

Engineers at Microsoft used GPUs to train the Z-code Translator model. "For our production model deployment, we opted for training a set of five billion parameter models, which are 80 times larger than our currently deployed models," according to Redmond's Hany Awadalla, principal research manager, Krishna Mohan, principal product manager, and Vishal Chowdhary, partner development manager.

The new model is now in production, and is powered by Nvidia's GPUs and its Triton Inference Server software, achieving up to a 27x speedup over non-optimized GPU configurations, apparently. It is currently available only to select customers, who must first be approved by Microsoft. They will be able to use the machine translation to translate text in Microsoft Word, PowerPoint, and PDF documents. ®
