Google Translate is used by over 200 million people daily and, according to boffins from Brazil, its AI-powered tongue twisting tends to deliver sexist results.
In a research paper distributed through preprint service arXiv, "Assessing Gender Bias in Machine Translation – A Case Study with Google Translate," Marcelo Prates, Pedro Avelar, and Luis Lamb from Brazil's Federal University of Rio Grande do Sul explore how Google Translate renders gender pronouns in English from sentences written in a dozen different gender-neutral languages.
The researchers took jobs described in US Bureau of Labor Statistics (BLS) data and used them to construct the equivalent of "She is an engineer" or "He is an engineer" in languages like Chinese, Hungarian, Japanese and Turkish, where the pronoun does not indicate gender.
They then ran the sentences through Google Translate, via its API, to see which gendered pronouns Google's language model chose in English. They compared the resulting ratio of female to male pronouns with the ratio expected from actual participation by gender in each job.
In theory, sentences describing a job held predominantly by women would be translated with female pronouns at roughly the same rate as women's share of that job, given that the translation model would be trained on data reflecting that baseline.
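For the curious, the counting step can be sketched in a few lines of Python. This is an illustration only, not the researchers' code: the function names, the pronoun lists, the "neutral" bucket and the sample sentences are all assumptions made up for the example, and no translation API is called.

```python
import re
from collections import Counter

# Assumed pronoun lists for this sketch; the paper's own rules may differ.
FEMALE = {"she", "her", "hers"}
MALE = {"he", "him", "his"}


def pronoun_gender(sentence):
    """Label an English sentence by its first gendered pronoun, else 'neutral'."""
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word in FEMALE:
            return "female"
        if word in MALE:
            return "male"
    return "neutral"


def female_share(translations):
    """Fraction of translated sentences that use a female pronoun."""
    counts = Counter(pronoun_gender(s) for s in translations)
    total = sum(counts.values())
    return counts["female"] / total if total else 0.0


# Made-up English outputs standing in for translations of one job title
# from four different source languages.
sample = [
    "He is an engineer",
    "He's an engineer.",
    "She is an engineer",
    "It is an engineer",
]

share = female_share(sample)
print(f"female pronoun share: {share:.2%}")
```

The share computed for each job would then be set against the BLS figure for women's participation in that job. A share well below the BLS figure is the male default the paper describes.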
"We show that [Google Translate] exhibits a strong tendency towards male defaults, in particular for fields linked to unbalanced gender distribution such as STEM jobs," the researchers state in their paper. "We ran these statistics against BLS’ data for the frequency of female participation in each job position, showing that GT fails to reproduce a real-world distribution of female workers."
The researchers found that Google Translate rendered sentences with female pronouns 11.76 per cent of the time, averaged across all occupations and languages. Based on BLS data, gender participation for female workers across all jobs came to 35.94 per cent.
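To put those two figures side by side, here is a back-of-the-envelope calculation in Python. The two percentages come from the paper as quoted above. The variable names and the choice of comparison (a difference in percentage points and a simple ratio) are ours, not the researchers'.

```python
# Figures quoted from the paper.
translated_female_pct = 11.76  # female pronouns in Google Translate output
bls_female_pct = 35.94         # female participation across all jobs, per BLS

# Gap in percentage points, rounded to avoid floating-point noise.
gap_points = round(bls_female_pct - translated_female_pct, 2)

# How many times more often women appear in the workforce data
# than in the translated sentences.
ratio = bls_female_pct / translated_female_pct

print(f"gap: {gap_points} percentage points")
print(f"ratio: {ratio:.2f}x")
```

On these numbers, women show up in the labor statistics roughly three times as often as they do in Google Translate's output.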
In short, Google Translate would rather talk about men than women.
"Our results show that male defaults are not only prominent but exaggerated in fields suggested to be troubled with gender stereotypes, such as STEM (Science, Technology, Engineering and Mathematics) jobs," the paper says.
Further evidence of algorithmic bias – which might be described as failure to compensate for cultural favoritism – showed up in the associations of certain adjectives with certain gender pronouns. Sentences with the words "attractive," "ashamed," "happy," "kind," and "shy" tended to be translated with female pronouns. Sentences with "arrogant," "cruel," and "guilty" were translated as male.
What's more, the researchers speculate that the bias shown in English may influence other languages, because "Google Translate typically uses English as a lingua franca to translate between other languages."
As a possible remedy, the researchers point to other academic work on algorithms that reduce the impact of bias, which they say shows promise.
The Register asked Google for comment but we've not heard back.
The paper says the code and data used to generate the experiment's results have been made available through Prates's GitHub repo, but at the time this article was filed, the provided link did not work. It also cautions that because the Google Translate code is subject to ongoing revision, the research results, gathered in April 2018, may not be reproducible. ®