The truth about Dropbox opening up your files to AI – and the loss of trust in tech
'Your info won't be harvested for training' is the new 'Your private chatter won't be used for ads'
Comment Cloud storage biz Dropbox spent time on Wednesday trying to clean up a misinformation spill because someone was wrong on the internet.
Through exposure to the social media echo chamber, various people – including Amazon CTO Werner Vogels – became convinced that Dropbox, which introduced a set of AI tools in July, was by default feeding OpenAI, maker of ChatGPT and DALL·E 3, with user files as training fodder for AI models.
Vogels and others advised Dropbox customers to check their settings and opt out of allowing third-party AI services to access their files. For some people, this setting appeared to be opt-in; for others, opt-out. No explanation was offered by Dropbox.
Artist Karla Ortiz and celeb Justine Bateman, who like Vogels have significant social media followings, each publicly condemned Dropbox for seemingly allowing outside AI outfits, automatically and by default, to drill into people's documents.
It was not an implausible scenario, given that tech firms tend to opt users in by default and OpenAI has refused to disclose its models' training data. The Microsoft-backed machine-learning super lab, for those who haven't been following closely, has been sued by numerous artists, writers, and developers for allegedly training its models on copyrighted content without permission. To date, some of those disputes remain unresolved while others have been thrown out.
While there's widespread outrage among content creators about AI models trained without permission on their work, OpenAI and backers like Microsoft have bet – by offering to indemnify customers using AI services – that they'll prevail in court, or at least make enough money to shrug off potential damages.
It's a bet that YouTube won. The video sharing site made its name distributing copyrighted clips that its users uploaded. Sued by Viacom for massive copyright infringement in 2007, YouTube escaped liability through the Digital Millennium Copyright Act.
In any event, Dropbox CEO Drew Houston had to set Vogels straight, responding to the Amazonian's post by writing: "Third-party AI services are only used when customers actively engage with Dropbox AI features which themselves are clearly labeled …
"The third-party AI toggle in the settings menu enables or disables access to DBX AI features and functionality. Neither this nor any other setting automatically or passively sends any Dropbox customer data to a third-party AI service."
In other words, the setting is off until a user chooses to integrate an AI service with their account, which then flips the setting on. Switching it off cuts off access to those third-party machine-learning services.
Even so, Houston conceded Dropbox deserved blame for not communicating with its customers more clearly.
Vogels, however, insisted otherwise. "Drew, this error is completely on me," he wrote. "I was pointed at this by some friends, and with confirmation bias, I drew the wrong conclusion. Instead I should [have] connected with you asking for clarification. My sincere apologies."
That could have been the end of it, but for one thing: as noted by developer Simon Willison, many people no longer trust what big tech or AI entities say. Willison refers to this as the "AI Trust Crisis," and argues that greater transparency is needed, offering a few suggestions that could help – like OpenAI revealing the data it uses for model training.
That is a fair diagnosis for what ails the entire industry. The tech titans behind what's been referred to as "Surveillance Capitalism" – Amazon, Google, Meta, data-gathering enablers and brokers like Adobe and Oracle, and data-hungry AI firms like OpenAI – have a history of opacity with regard to privacy, business practices, and algorithms.
To detail the infractions through the years – the privacy scandals, lawsuits, and consent decrees – would take a book. Recall that this is the industry that developed "dark patterns" – ways to manipulate people through interface design – and routinely opts customers into services by default because it knows few would bother to make that choice themselves.
Willison concludes that technologists need to earn our trust, and asks how we can help them do that. Transparency is part of the solution – we need to be able to audit the algorithms and data being used. But that has to be accompanied by mutually understood terminology. When a technology provider tells you "We don't sell your data," that isn't supposed to mean "We let third parties you don't know build models or target ads using your data, which remains on our servers and technically isn't sold."
That brings us back to Houston's acknowledgement that "any customer confusion about this is on us, and we'll take a turn to make sure all this is abundantly clear!"
There's a lot of confusion about how code, algorithms, cloud services, and business practices work. And sometimes that's a feature rather than a bug. ®