Did you know that if you upload pictures, text, audio, code, and other content to some of Amazon Web Services' AI systems for processing, the internet giant may quietly keep your data to retrain and improve its current and future technology?
That means if your website, application, or software development process is powered by one of these machine-learning services, your users' content may be retained by the AWS mothership and its affiliates to further develop their machine-learning tech. The collected data also may be stored outside the region where you ran the AI service.
The data-gobbling products are: Amazon CodeGuru Profiler, Amazon Comprehend, Amazon Lex, Amazon Polly, Amazon Rekognition, Amazon Textract, Amazon Transcribe, and Amazon Translate.
Amazon thus may, somewhere, somehow, keep a copy of faces passed through the facial-recognition service Rekognition, audio clips sent to speech-to-text Amazon Transcribe, and so on. We imagine this is so that, for instance, any content that flummoxes the AI systems is retained so Amazon can study it and retrain its models to better understand the data in future. The content could also be used to develop future AI systems.
If you weren't happy with any of this, you had to, up until now, ask AWS Support to opt you out of this information harvesting.
Since last week, we note, Amazon has provided alternative routes for organizations that wish to opt out: settings in a web dashboard, the command line, and an AWS API. Documentation on how to opt out is here.
"You can now use AWS Organizations to manage the use of your content by some AWS machine learning services," the internet giant said in a note dated July 9.
"Certain services (Amazon CodeGuru Profiler, Amazon Comprehend, Amazon Lex, Amazon Polly, Amazon Rekognition, Amazon Textract, Amazon Transcribe, and Amazon Translate) may use content to improve the service. Previously, you could opt out of this use by contacting AWS Support. This new feature allows you to configure an organizational policy to opt out, without the need to contact AWS Support, and to have your configuration applied to all accounts within an organization."
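For the API route, here is a minimal sketch in Python of what an organization-wide opt-out might look like. It assumes the policy type is named `AISERVICES_OPT_OUT_POLICY`, that the policy document uses a `services` / `default` / `opt_out_policy` layout with the value `optOut`, and that the client passed in behaves like boto3's `organizations` client; check all of these against AWS's current documentation before relying on them. The function names, the policy name, and the root ID in the usage comment are ours, purely for illustration.

```python
import json

# Assumption: the policy type string AWS Organizations uses for AI opt-out.
POLICY_TYPE = "AISERVICES_OPT_OUT_POLICY"


def build_opt_out_policy() -> str:
    """Return a JSON policy that opts every covered AI service out by default."""
    policy = {
        "services": {
            # Assumption: "default" applies to all services not listed by name.
            "default": {
                "opt_out_policy": {"@@assign": "optOut"}
            }
        }
    }
    return json.dumps(policy, indent=2)


def apply_opt_out(org_client, root_id: str, name: str = "ai-opt-out-all") -> str:
    """Enable the policy type, create the policy, attach it to the org root.

    org_client is expected to look like boto3.client("organizations").
    Note: enabling a policy type that is already enabled raises an error
    in the real service, so a production script would need to handle that.
    """
    org_client.enable_policy_type(RootId=root_id, PolicyType=POLICY_TYPE)
    created = org_client.create_policy(
        Name=name,
        Description="Opt all accounts out of AI service content use",
        Type=POLICY_TYPE,
        Content=build_opt_out_policy(),
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    # Attaching at the root applies the policy to every account beneath it.
    org_client.attach_policy(PolicyId=policy_id, TargetId=root_id)
    return policy_id


# Hypothetical usage, run from the organization's management account:
#   import boto3
#   apply_opt_out(boto3.client("organizations"), "r-examplerootid")
```

Attaching the policy at the organization root is what gives the "applied to all accounts within an organization" behavior described in Amazon's note; attaching it to a single account or organizational unit instead would narrow its reach.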
It’s not immediately obvious that AWS may retain and use submitted content for purposes other than directly converting it from one form to another by an AI – such as speech to text. To find out for sure, you’d have to wade through its terms of service until you reach section 50:
You agree and instruct that for Amazon CodeGuru Profiler, Amazon Comprehend, Amazon Lex, Amazon Polly, Amazon Rekognition, Amazon Textract, Amazon Transcribe, and Amazon Translate: (a) we may use and store AI Content that is processed by each of the foregoing AI Services to maintain and provide the applicable AI Service (including development and improvement of such AI Service) and to develop and improve AWS and affiliate machine-learning and artificial-intelligence technologies; and (b) solely in connection with the usage and storage described in clause (a), we may store such AI Content in an AWS region outside of the AWS region where you are using such AI Service.
The key part is "we may use and store AI Content ... to develop and improve AWS and affiliate machine-learning and artificial-intelligence technologies." This means, for example, Amazon can take the inputs processed by Amazon Rekognition, and store them to retrain its facial-recognition system. Those inputs may be photos uploaded to the service, or it may be some intermediate format that's fed into the machine-learning algorithms. AWS did not respond to The Register's request to clarify the exact nature of what it considered “AI Content.”
In its T&Cs, Amazon broadly defined "AI Content" as "Your Content that is processed by an AI Service," and that "Your Content" included "any 'Company Content' and any 'Customer Content.'"
Finally, it's on you to let your own customers know their data may be slurped by Amazon, according to the fine print:
You are responsible for providing legally adequate privacy notices to End Users of your products or services that use any AI Service and obtaining any necessary consent from such End Users for the processing of AI Content and the storage, use, and transfer of AI Content as described under this Section 50.
The T&Cs appear to have been in place since 2017, according to Computer Business Review, which also clocked the new opt-out procedures. AWS, we note, does not harvest the data uploaded to all of its AI services – Amazon SageMaker, Amazon Kendra, Amazon Personalize, Amazon Forecast, Amazon Comprehend Medical and Amazon Transcribe Medical are exempt.
Amazon declined to comment on the record. ®