Google asks websites to kindly not break its shiny new targeted-advertising API

Tech tweaked ahead of rollout in July, Mozilla and Apple still not interested

Google plans to ship its Topics API when Chrome 115 arrives on July 12. That's the API that's supposed to allow advertisers to target netizens with adverts tailored to their individual interests without impinging on people's privacy.

And to help prevent privacy problems, the ad giant is asking advertisers to promise they won't abuse this ad-targeting mechanism.

Last May, Alexandre Gilotte, senior data scientist and software engineer for ad platform firm Criteo, opened a GitHub Issues discussion describing a potential fingerprinting attack on the Topics API that could be used to identify people online.

Last Thursday, with Google preparing to make Topics available next month in Chrome, Josh Karlin, technical lead and manager of Google's Privacy Sandbox project, closed the year-old discussion.

"Since this discussion, we've added a requirement on Chrome that developers enroll to use the API and to attest that they won't abuse the API," he wrote. "That's not a technical solution, but I do believe it goes a long way to addressing this problem. Closing for now."

It remains an open question, however, whether other browser makers will ever support the API. Both Firefox maker Mozilla and Safari developer Apple have indicated they oppose the Topics proposal.

"Fundamentally, we just can't see a way to make this work from a privacy standpoint," said Mozilla distinguished engineer Martin Thomson, in January in response to a request for an official position statement from Karlin.

"Though the information the API provides is small, our belief is that this is more likely to reduce the usefulness of the information for advertisers than it provides meaningful protection for privacy. Unfortunately, it is hard to identify concrete ways in which this might be improved."

Anne van Kesteren, who works on web standards at Apple, cited ten issues with the API and declared that the iGiant is opposed to it. "We don’t think cross-site data about the user’s browsing behavior should be exposed in APIs," he said. "We've been working for ten years in the opposite direction, partitioning data per-top-level-site."

Google, having last year abandoned its previous interest-based API, Federated Learning of Cohorts (FLoC), is nonetheless moving forward with Topics because it needs something that will enable interest-based advertising once the already delayed deprecation of third-party cookies occurs in Q3 2024.

How the API works

The Topics API is one of several possibly privacy-preserving proposals for handling digital advertising once support for third-party cookies goes away. Part of what Google has been calling its Privacy Sandbox, Topics provides a mechanism for serving ads that correspond to the inferred interests of web users.

Basically, when a user visits a website and the website wants to show an ad, the website can run JavaScript code (or check the request header Sec-Browsing-Topics) to fetch a list of up to three topics, from a taxonomy of several hundred interest categories, derived from the user's past website visits. That allows the site to show an ad believed to be relevant to the visitor's known interests.

"With Topics, your browser determines a handful of topics, like 'Fitness' or 'Travel & Transportation,' that represent your top interests for that week based on your browsing history," explained Vinay Goel, product director of Privacy Sandbox at Google, last year.

"Topics are kept for only three weeks and old topics are deleted. Topics are selected entirely on your device without involving any external servers, including Google servers. When you visit a participating site, Topics picks just three topics, one topic from each of the past three weeks, to share with the site and its advertising partners."

The API occasionally may also return a random topic. In browsers supporting Topics, like the upcoming Chrome 115, a webpage invoking the API thus…

const topics = await document.browsingTopics();

…might return an array formatted thus…

[{'configVersion': String, 'modelVersion': String, 'taxonomyVersion': String, 'topic': Number, 'version': String}]

…where "Number" is an ID in a numbered taxonomy of predefined interests. The value 1 refers to "/Arts & Entertainment" while 277 refers to "/Jobs & Education/Education/Foreign Language Study."
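As a rough illustration of the call and return format described above, the sketch below wraps the API call in a feature check and turns topic IDs into readable labels. The helper names (readTopics, labelTopics) and the TAXONOMY_LABELS map are hypothetical, and the map holds only the two IDs mentioned in this article; it is not the real taxonomy.

```javascript
// Labels for the two taxonomy IDs cited in this article only.
// The real taxonomy has several hundred entries.
const TAXONOMY_LABELS = new Map([
  [1, '/Arts & Entertainment'],
  [277, '/Jobs & Education/Education/Foreign Language Study'],
]);

// Hypothetical helper: resolves to [] when the API is missing or the call fails.
// `doc` is passed in so the function can be exercised outside a browser.
async function readTopics(doc) {
  if (!doc || typeof doc.browsingTopics !== 'function') {
    return [];
  }
  try {
    return await doc.browsingTopics();
  } catch (err) {
    // Assumption: a rejected call is treated the same as "no topics".
    return [];
  }
}

// Hypothetical helper: maps each returned entry's `topic` number to a label.
function labelTopics(topics) {
  return topics.map(
    ({ topic }) => TAXONOMY_LABELS.get(topic) ?? `unlabelled topic ${topic}`
  );
}

// In a browser, this logs whatever labels apply to the current visitor.
if (typeof document !== 'undefined') {
  readTopics(document).then((topics) => console.log(labelTopics(topics)));
}
```

Falling back to an empty list keeps pages working in browsers that never ship the API, which matters given the positions Mozilla and Apple have taken.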

Armed with that information, webpage code could then request a topic-related ad, which ideally would better engage the web visitor and generate more revenue because the advertiser would pay a premium to reach the desired audience.
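Stepping back to the selection process described above (one topic from each of the past three weeks, with the occasional random topic), here is a minimal sketch of how that choice might look. It is not Chrome's implementation. The function name, the uniform sampling, and the assumption that topic IDs run from 1 to the taxonomy size are all illustrative; the one-in-twenty noise rate and the 469-topic taxonomy are figures quoted elsewhere in this article.

```javascript
// Figures quoted in this article; Chrome's real sampling may differ.
const EPOCHS_SHARED = 3;    // "one topic from each of the past three weeks"
const RANDOM_RATE = 1 / 20; // noise rate as described by Mozilla's spokesperson
const TAXONOMY_SIZE = 469;  // size of the revised taxonomy

// Hypothetical helper. weeklyTopTopics is an array of per-week arrays of
// topic IDs, oldest week first. rng is injectable for deterministic checks.
function pickTopicsToShare(weeklyTopTopics, rng = Math.random) {
  const shared = [];
  // Weeks older than the last three are ignored ("old topics are deleted").
  for (const topTopics of weeklyTopTopics.slice(-EPOCHS_SHARED)) {
    if (rng() < RANDOM_RATE) {
      // Assumption: noise is a uniformly random ID in 1..TAXONOMY_SIZE.
      shared.push(1 + Math.floor(rng() * TAXONOMY_SIZE));
    } else if (topTopics.length > 0) {
      // Assumption: one of that week's top topics is chosen uniformly.
      shared.push(topTopics[Math.floor(rng() * topTopics.length)]);
    }
  }
  return shared;
}

// Example call: four weeks of history, so only the last three are considered.
console.log(pickTopicsToShare([[1], [1, 277], [277], [1, 277]]));
```

Because everything runs on plain arrays, the sketch also reflects the claim that selection happens on the device without involving external servers.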

Gilotte's concern is that a web publisher could implement the Topics API by including the requisite JavaScript on multiple websites and then build a fingerprint identifier based on how the websites behave for the user.

The Topics API has a "witness" requirement – it only reveals a visitor's interest in a topic if the site has previously received data in that topic category. So a script on a webpage that observes a user visiting a news site could learn that the user has an affinity for news, but not that the user is interested in, say, shopping.

This rule – which Google calls its "per-caller filtering requirement" and which may help Google more than smaller companies with less visibility into web visits – can be exploited to gain one bit of entropy about the visitor: whether or not the site has seen the topic.

With enough bits of entropy, you get a fingerprint – we're talking about dozens of websites over weeks of observation. According to Mozilla's Thomson, 20 bits allow for one-in-a-million differentiation, since 2^20 is a little over a million. He elaborates on his concerns in a paper [PDF] published in January titled "A Privacy Analysis of Google’s Topics Proposal."
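To make the witness rule and the entropy arithmetic concrete, here is a small sketch. It is a toy model, not the attack Gilotte described in full. The function names are hypothetical, and the topic labels "News" and "Shopping" are stand-ins for real taxonomy IDs.

```javascript
// Per-caller filtering: a caller only receives topics it has itself
// witnessed for this user. witnessedByCaller is a Set of topics.
function filterForCaller(userTopics, witnessedByCaller) {
  return userTopics.filter((topic) => witnessedByCaller.has(topic));
}

// The leaked bit: whether the topic comes back reveals whether this
// caller has witnessed it for the visitor.
function leakedBit(userTopics, witnessedByCaller, topic) {
  return filterForCaller(userTopics, witnessedByCaller).includes(topic);
}

// n independent yes/no observations can distinguish at most 2^n visitors.
function distinguishableVisitors(bits) {
  return 2 ** bits;
}

// A script that has only seen the user on a news site learns about "News",
// but not about "Shopping".
console.log(filterForCaller(['News', 'Shopping'], new Set(['News'])));
console.log(distinguishableVisitors(20)); // 1048576
```

The last line is the arithmetic behind the one-in-a-million figure: 20 independent bits give 1,048,576 possible combinations.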

"We conclude that Topics has significant and structural privacy challenges that are difficult to remediate," Thomson wrote.

Google's response

In an attempt to address some of the concerns that have arisen, Karlin and others at Google argue that Topics offers better privacy than third-party cookies – which don't offer much privacy. In April, Karlin and ten colleagues released a paper [PDF] outlining the math to evaluate that claim.

And earlier this month, Google announced some changes to the Topics API.

There's a new taxonomy of 469 interest topics, up from 349 previously. That is still smaller than the IAB Audience Taxonomy, which Google says contains about 1,500 topics. Some 280 commercially focused categories like Athletic Apparel, Mattresses, and Luxury Travel were added, while 160 less monetizable categories like Civil Engineering and Equestrian were removed.

"We chose to limit the taxonomy's size, to protect against re-identification risk," explained Leeron Israel, product manager for Google's Privacy Sandbox.

Google, said Israel, also plans to let users block specific topics. "This means users will be able to curate the set of available topics they are interested in by removing selected topics," he said. "This change, coming by early next year, will give users even more control over their privacy and make the Topics API even more user-friendly."

Mozilla remains unconvinced.

"We're not enthusiastic about building features that reveal people's browsing history," a company spokesperson told The Register in an email.

"Google is content to utilize low-level noise to offer a sense of privacy. Randomizing the data at a rate of one in twenty may lessen its effectiveness for advertising, but it isn't much of a consolation for those who are re-identified using that information."

Evidently, there will be an off switch. ®
