Machine needs more Learning: Google Drive dings single-character files for copyright infringement

If you're unable to share documents, this may be why


Google last month announced plans to prevent customer files stored in Google Drive from being shared when the web giant's automated scanning system finds that they violate its abuse prevention rules.

"When [a file is] restricted, you may see a flag next to the filename, you won't be able to share it, and your file will no longer be publicly accessible, even to people who have the link," Google explained at the time.

That system is now up and running, just not very well: Google Drive's scanning system has been finding copyright violations where they do not exist and flagging innocuous files.

Dr Emily Dolson, assistant professor at Michigan State University, in the departments of Computer Science & Engineering and Ecology, Evolution, & Behavior, had a run-in with the errant scanner recently when she uploaded a file named "output04.txt" that consisted of a single character, the numeral one.

One wonders what exactly upset Google – the digit or the output04.txt filename? Certainly the number 1 does turn up in all manner of copyrighted works. No one let the internet search giant know that Microsoft has its own cloud storage named OneDrive.

"I'm currently teaching a graduate-level algorithms class where students need to write code that solves problems I give them," Dolson told The Register today via email. "I like to make the test cases I use to evaluate the code freely accessible to students to assist them with debugging.

"This issue occurred when I uploaded a large set of files to Drive containing inputs and expected outputs for these test cases. Among the expected output files, there were a few that contained just the character '1'. Shortly after uploading them, I received a string of emails from Google indicating that those files had been flagged for copyright infringement."

Dolson can still access the files, but she cannot share them, which she said was unfortunate because she created them to share with her students.

Others have reported similar experiences. Richard D. Morey, a Reader (UK lingo for professor) in psychology at Cardiff University, responded to Dolson's Twitter post by noting, "I stopped using Google Drive professionally for this reason. It was flagging and pulling down documents I authored myself, and no students could access them!"

And other people responding to Dolson's post claim to have independently replicated the issue by getting small files flagged in Drive.

As has been pointed out by those participating in the Twitter discussion, Europe's GDPR gives people "the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her." US privacy law, however, doesn't really do much for those subject to false algorithmic decisions.

The Register twice asked Google to confirm that Drive's content flagging system is broken but we've not heard back. However, a Google engineering manager responding to Dolson's Twitter thread acknowledged that Google is aware of the issue.

Google's post announcing its Drive abuse notification system depicts a sample message that includes a button labeled "Request a Review," to have a human check the violation scanner's decision.

But Dolson said the automated email notification she received offered no way to push back against the determination of Google's content vetting algorithm – the Request a Review button was not included in the message she received, as can be seen from the screenshot she posted.

Which, you know, is a bit worrying for people concerned about the dead hand of AI being used as arbiter in these matters.

"The e-mail explicitly said 'A review cannot be requested for this restriction,'" she explained. "I do think that it is problematic to automate processes like this without providing any mechanism for a manual override.

"In this case it's a fairly minor inconvenience (I can just tell my students that the answer is 1), but in a different context it might be a much bigger problem. It's totally normal and understandable for software to have bugs, but that's exactly why there needs to be a mechanism for communicating those bugs back to the developers."

Dolson also took issue with allowing social media to drive customer support.

"Relying on viral social media posts as a sort of backdoor communication channel to the developers should not be the only option – that opens up a heap of equity concerns," she said. "Your ability to receive support for software products should not depend on whether you are sufficiently well connected to technology Twitter."

Netizens reported problems with other numbers, including 0, while the wags over on Hacker News pointed to a mildly relevant Onion article, headlined: "Microsoft Patents Ones, Zeroes."

Because there's always an Onion article where automation drives swathes of the IT world beyond satire. ®

Editor's note: Article updated to include quotes from Dr Emily Dolson.



Biting the hand that feeds IT © 1998–2022