Reminder: When a tech giant says it listens to your audio recordings to improve its AI, it means humans are listening. Right, Skype? Cortana?
Opt-in translations feature hands chats to contractors to fix up. Redmond says it's covered by fine print
If you use Skype's AI-powered real-time translator, brief recordings of your calls may be passed to human contractors, who are expected to listen in and correct the software's translations to improve it.
That means 10-second or so snippets of your sweet nothings, mundane details of life, personal information, family arguments, phone sex, and other stuff discussed on Skype sessions via the translation feature may be eavesdropped on by strangers, who check the translations for accuracy and feed back any changes into the machine-learning system to retrain it.
An acknowledgement that this happens is buried in an FAQ for the translation service, which states:
To help the translation and speech recognition technology learn and grow, sentences and automatic transcripts are analyzed and any corrections are entered into our system, to build more performant services.
Microsoft reckons it is being transparent in the way it processes recordings of people's Skype conversations. Yet one thing is missing from that above passage: humans. The calls are analyzed by humans. The more technological among you will have assumed living, breathing people are involved at some point in fine-tuning the code and may therefore have to listen to some call samples. However, not everyone will realize strangers are, so to speak, sticking a cup against the wall of rooms to get an idea of what's said inside, and so it bears reiterating.
Especially seeing as sample recordings of people's private Skype calls were leaked to Vice, demonstrating that the Windows giant's security isn't all that. "The fact that I can even share some of this with you shows how lax things are in terms of protecting user data," one of the translation service's contractors told the digital media monolith.
It's not clear right now when snippets of Skype calls are passed along to humans to analyze; presumably when the real-time translator is unable to parse a sentence, or a bungled translation is flagged up and needs attention from a freelancer. We've asked Microsoft to clarify.
The translation contractors use a secure and confidential website provided by Microsoft to access samples awaiting playback and analysis, which are, apparently, scrubbed of any information that could identify those recorded and the devices used. For each recording, the human translators are asked to pick from a list of AI-suggested translations that potentially apply to what was overheard, or they can override the list and type in their own.
Also, the same goes for Cortana, Microsoft's voice-controlled assistant: the human contractors are expected to listen to people's commands to appraise the code's ability to understand what was said. The Cortana privacy policy states:
When you use your voice to say something to Cortana or invoke skills, Microsoft uses your voice data to improve Cortana’s understanding of how you speak.
Buried deeper in Microsoft's all-encompassing fine print is this nugget (with our emphasis):
We also share data with Microsoft-controlled affiliates and subsidiaries; with vendors working on our behalf; when required by law or to respond to legal process; to protect our customers; to protect lives; to maintain the security of our products; and to protect the rights and property of Microsoft and its customers.
In a statement, a Microsoft spokesperson told us on Wednesday:
Microsoft collects voice data to provide and improve voice-enabled services like search, voice commands, dictation or translation services. We strive to be transparent about our collection and use of voice data to ensure customers can make informed choices about when and how their voice data is used. Microsoft gets customers’ permission before collecting and using their voice data.
We also put in place several procedures designed to prioritize users’ privacy before sharing this data with our vendors, including de-identifying data, requiring non-disclosure agreements with vendors and their employees, and requiring that vendors meet the high privacy standards set out in European law. We continue to review the way we handle voice data to ensure we make options as clear as possible to customers and provide strong privacy protections.
Again, not mentioning explicitly that humans are involved in analyzing private recordings from Cortana and the Skype translator is arguably not that transparent. Also, if audio samples can be leaked to journalists, we question these "high privacy standards."
Separately, spokespeople for the US tech titan claimed in an email to El Reg that users' audio data is only collected and used after they opt in. However, as we've said, it's not clear folks realize they are opting into letting strangers snoop on multi-second stretches of their private calls and Cortana commands. You can also control what voice data Microsoft obtains, and how to delete it, via a privacy dashboard, we were reminded.
In short, Redmond could just say flat out that it lets humans, as well as machine-learning software, pore over your private and sensitive calls and chats, but it won't because it knows folks, regulators, and politicians would freak out if they knew the full truth.
This comes as Apple stopped using human contractors to evaluate people's conversations with Siri, and Google came under fire in Europe for letting workers snoop on its smart speakers and assistant. Basically, as we've said, if you're talking to or via an AI, you're probably also talking to a person – and perhaps even the police. ®
Additional reporting by Richard Speed.