As the sacred shopping season gets underway, the Electronic Frontier Foundation has issued a report detailing the privacy cost of surveillance-based commerce.
Issued on the Monday after the US observance of Thanksgiving, a day so known for online shopping that marketers branded the event with its own commerce-promoting moniker, the report, "Behind the One-Way Mirror," explores the technology of corporate data gathering, specifically third-party tracking. Third-party tracking occurs when websites and applications include code that enables entities other than the website or app publisher to gather data about those interacting with the software.
"The purpose of this paper is to demystify tracking by focusing on the fundamentals of how and why it works and explain the scope of the problem," said Bennett Cyphers, EFF staff technologist and report author, in a statement.
"We hope the report will educate and mobilize journalists, policy makers, and concerned consumers to find ways to disrupt the status quo and better protect our privacy."
The problem, as the EFF sees it, is that such data tends to be collected surreptitiously, without meaningful consent.
"Most third-party data collection in the US is unregulated," said Cyphers. "The first step in fixing the problem is to shine a light, as this report does, on the invasive third-party tracking that, online and offline, has lurked for too long in the shadows."
That may seem like a quixotic quest. Since online ads first appeared in 1994, they've been served with little regard for privacy. Practices that once elicited alarm, like web bugs (invisible 1-pixel images loaded to track online activities), have become commonplace, and industries that benefit from online data collection have, for the most part, stymied serious regulation.
In the intervening years, voracious data gatherers like Facebook and Google have apologized for privacy violations and promised to do better many times, while taking action to preserve the status quo. And they're only two among the multitude of companies in the digital ad ecosystem that depend on data gathering. Despite rising regulatory scrutiny, periodic scandals, and pocket-change fines, not much has changed.
There have been some improvements in the technical defenses available to internet users, but data grabbers have deployed countermeasures to ensure the continued flow of information. Today, major companies like Adobe openly advise their customers to abuse the domain name system to ensure that they can collect marketing analytics data, even when internet users have taken action not to be tracked.
Nevertheless, the EFF's report may prove useful to help people understand the vastness of the web data landscape. Take the list of commonly used identifiers, for example.
It consists not just of widely known culprits from the web ecosystem like cookies (small files that websites set and browsers store), IP addresses, TLS state, supercookies kept in browser local storage, and browser fingerprints, but also of data points from other networks and systems like mobile phone numbers, IMSI and IMEI numbers, advertising IDs, MAC addresses, license plate numbers, face prints, and credit card numbers, among many others.
The availability of so many identifiers means that it's extremely difficult to avoid being tracked. As Cyphers explains, "If a user clears their cookies but their IP address doesn’t change, linking the old cookie to the new one is trivial. If they block third-party cookies and use a hard-to-fingerprint browser like Safari, trackers can use first-party cookie sharing in combination with TLS session data to build a long-term profile of user behavior."
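To see why Cyphers calls that linking trivial, consider a minimal sketch in Python. It is an illustration only, not code from the EFF report or from any real tracker: the function names, the cookie IDs, and the IP addresses (taken from documentation-reserved ranges) are all hypothetical, and it assumes the tracker logs a (cookie ID, IP address) pair for each request it sees.

```python
from collections import defaultdict


def link_cookies_by_ip(events):
    """Group cookie IDs by the IP address they were seen from.

    `events` is an iterable of (cookie_id, ip_address) pairs in the order
    the hypothetical tracker observed them. Returns a dict mapping each IP
    to the distinct cookie IDs seen from it, in order of first appearance.
    """
    cookies_by_ip = defaultdict(list)
    for cookie_id, ip in events:
        if cookie_id not in cookies_by_ip[ip]:
            cookies_by_ip[ip].append(cookie_id)
    return dict(cookies_by_ip)


def linked_profiles(events):
    """Keep only the IPs where more than one cookie ID showed up.

    Each such group is a candidate for merging into a single profile.
    """
    grouped = link_cookies_by_ip(events)
    return {ip: ids for ip, ids in grouped.items() if len(ids) > 1}


# Hypothetical log: a user clears cookies but keeps the same IP address.
events = [
    ("cookie-old", "203.0.113.7"),
    ("cookie-other", "198.51.100.20"),
    ("cookie-new", "203.0.113.7"),
]

print(linked_profiles(events))
```

In this toy log, the old and new cookie IDs end up grouped under the one IP address they share, while the unrelated visitor is left alone. A real tracker would presumably weigh further signals, since many people can sit behind a single address, but the sketch shows how little effort the basic join requires.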
Citing data from browser maker Cliqz, the report says Google collects data on ~80 per cent of the web traffic measured, followed by Facebook (~25 per cent) and Amazon (~18 per cent).
The report also covers how data brokers operate, how real-time ad bidding works, how cookies from different services can be synced to a consistent identifier, and other aspects of the data business. It touches on privacy-protecting tools like the EFF's own Privacy Badger, content blocker uBlock Origin, the Pi-hole network filter, and the Tor Browser. And it references a variety of other reports on the subject.
Asked why the EFF is revisiting this topic now after years of minimal progress, Cyphers in an email said, "Never before has so much tracking power been concentrated in the hands of so few companies. GAFT [Google, Amazon, Facebook, and Twitter] have more data from more places that they can tie to single identities."
Cyphers is hopeful that government officials around the world may be ready, finally, to support substantive privacy rules.
"There is real momentum behind privacy legislation, both in the US and abroad, and we want to make sure lawmakers know what and how to regulate," he said.
"The tracking industry is huge and convoluted, and you can easily make rules that don't reflect the way things really work, or that play right into the hands of the biggest actors. We're trying to say, 'This problem is big, and complicated, and subtle, but it's not intractable.' We really don't want to waste the opportunity to score meaningful wins for privacy." ®