Eager to avoid the perception that it has been leafing through netizens' files – a fear it has contended with at least since it began scanning Gmail messages to inform its ad biz – Google on Thursday issued a second statement explaining why, two days earlier, it erroneously flagged the files of a small percentage of Docs and Drive users as violating its Terms of Service.
On Tuesday, the Chocolate Factory attributed the snafu to "a code push that incorrectly flagged a small percentage of Google Docs as abusive."
That evidently failed to address concerns, which isn't altogether surprising given that surveillance is the business model of the internet and that heroic efforts are required to avoid being tracked online.
As a measure of the suspicion among internet users, Rob Goldman, vice president of ads products at Facebook, recently felt obliged to deny via Twitter that Facebook eavesdrops on users through smartphone microphones. His denial recalls the insistence by Google, Facebook, and other cloud-based companies in 2013, following Edward Snowden's surveillance revelations, that they were not passing data to the US government.
In the shadow of that legacy of mistrust, Google on Thursday decided its explanation required further clarification.
In a blog post, Mark Risher, director of product management, acknowledged: "The blocking raised questions in the community and we would like to address those questions here."
Google Docs and Drive, he said, incorporate automated security mechanisms – static and dynamic antivirus techniques – to protect users from malware, phishing and spam.
"Virus and malware scanning is an industry best practice that performs automated comparisons against known samples and indicators," said Risher. "The process does not involve human intervention."
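Google hasn't published how its scanner works, but "automated comparisons against known samples" usually means, at its simplest, hashing a file and checking the digest against a blocklist. The sketch below illustrates that idea only; the blocklist entry (the SHA-256 of an empty file, used here purely as a stand-in) and the function name are invented for illustration.

```python
import hashlib

# Hypothetical blocklist of known-bad content hashes – the "known
# samples" of the quote. The single entry is the SHA-256 of empty
# input, used only as a placeholder for a real malware signature.
KNOWN_BAD_SHA256 = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def flag_if_known_bad(data: bytes) -> bool:
    """Return True if the file's SHA-256 digest matches a known sample."""
    digest = hashlib.sha256(data).hexdigest()
    return digest in KNOWN_BAD_SHA256
```

Production scanners layer much more on top – fuzzy hashes, behavioural indicators, URL reputation – but the core lookup requires no human to ever see the file, which is the point Risher is making.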
Got that? No humans were spying on you. Feel better now?
The bug in the pushed code caused the signals from those systems to be misinterpreted, meaning some users found their files erroneously flagged for nonexistent Terms of Service violations.
Google's follow-up, however, raises additional questions. "Static antivirus techniques" suggests static code analysis, which implies the automated examination of code. "Dynamic antivirus techniques" generally involve attempting to execute code on a real or virtual processor.
What kind of output do these techniques produce on text documents? File hashes or something that might be reversible to obtain the original content? Could these file hashes be used to find identical files associated with other users? How are the results stored and for how long? Can the results be accessed by other Google systems or through legal demands?
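The cross-user question above is easy to illustrate: if a scanner stores a content hash per file, identical files uploaded by different users collapse to the same key, so one hash can enumerate everyone holding that content. The index structure, user names, and file contents below are all invented for the sketch; nothing here describes Google's actual storage.

```python
import hashlib
from collections import defaultdict

# Hypothetical hash-to-users index: each content hash maps to the set
# of users who uploaded a byte-identical file.
index: defaultdict = defaultdict(set)

def record(user: str, content: bytes) -> str:
    """Index a user's file by its SHA-256 digest and return the digest."""
    digest = hashlib.sha256(content).hexdigest()
    index[digest].add(user)
    return digest

h1 = record("alice", b"shared leaked memo")
record("bob", b"shared leaked memo")   # identical bytes, identical hash
record("carol", b"something else")

# Every user holding the identical file is discoverable from one hash.
print(sorted(index[h1]))  # prints ['alice', 'bob']
```

If results like these are retained, the follow-up questions about storage duration and access by other systems or legal demands become more than academic.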
We asked Google if it cares to elaborate. We'll let you know if we hear anything. ®