Security

HTML5 may as well stand for Hey, Track Me Longtime 5. Ads can use it to fingerprint netizens

This language is wired for sound


Usenix Enigma – HTML5 is a boon for unscrupulous web advertising networks, which can use the markup language's features to build up detailed fingerprints of individual netizens without their knowledge or consent.

In a presentation at Usenix's Enigma 2018 conference in California this week, Arvind Narayanan, an assistant professor of computer science at Princeton, showed how some of the advanced features of HTML5 – such as audio playback – can be used to identify individual browser types and follow them around online to get an idea of what they're into.

For example, different browsers process sound files in slightly different ways, allowing an ad network – or any website – to potentially work out which version of a browser is being used on which operating system. Couple this with other details – such as the battery level and WebRTC – and you can start to form a fingerprint for an individual user.
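To see why tiny processing differences matter, here is a toy simulation in Python rather than real browser code. It assumes two hypothetical audio stacks that differ only in numeric precision (64-bit versus 32-bit floats), renders the same tone on each, and hashes the resulting samples. The sample rate, tone frequency, and function names are our own illustrative choices, not anything shown in Narayanan's talk.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 44_100  # assumed sample rate, in Hz
TONE_HZ = 440.0       # assumed test tone frequency


def render_tone(n_samples, to_float32):
    """Render a sine tone; optionally round each sample to 32-bit precision."""
    samples = []
    for n in range(n_samples):
        value = math.sin(2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE)
        if to_float32:
            # Round-trip through a 4-byte float to mimic a lower-precision stack.
            value = struct.unpack("<f", struct.pack("<f", value))[0]
        samples.append(value)
    return samples


def audio_fingerprint(samples):
    """Hash the exact sample values, so any rounding difference changes the digest."""
    raw = struct.pack(f"<{len(samples)}d", *samples)
    return hashlib.sha256(raw).hexdigest()[:16]


# Same input tone, two simulated stacks, two different fingerprints.
stack_a = audio_fingerprint(render_tone(1000, to_float32=False))
stack_b = audio_fingerprint(render_tone(1000, to_float32=True))
print(stack_a, stack_b)
```

Each simulated stack yields the same digest every time it runs, while the two stacks yield different digests from each other. That combination of stability and distinctiveness is what makes a signal useful for fingerprinting.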

Of course, your browser typically reveals its version number and the underlying operating system's details to web servers when fetching pages and other materials. However, from what Narayanan is saying, it is possible for ad networks and webmasters to bypass any attempts to suppress that information by probing the browser with HTML5 for traceable details. It also means that dumping JavaScript and cookies, and relying purely on HTML5, won't leave you completely free from online tracking by advertisers.
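As a rough sketch of why hiding the version string alone doesn't help, the Python below folds several probed signals into a single identifier. The signal names and values (`audio_hash`, `webrtc_local_ip`, and so on) are hypothetical placeholders we made up for illustration, and the hashing scheme is an assumption, not a description of any real tracker's code.

```python
import hashlib
import json


def combined_fingerprint(signals):
    """Hash a dict of probed signals into one short identifier."""
    # Sort keys so the same signals always serialize identically.
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Two hypothetical visitors who have both blanked their user-agent string.
visitor_a = {
    "user_agent": "",
    "audio_hash": "3f9a",
    "webrtc_local_ip": "192.168.1.20",
}
visitor_b = {
    "user_agent": "",
    "audio_hash": "c41d",
    "webrtc_local_ip": "192.168.1.20",
}

print(combined_fingerprint(visitor_a))
print(combined_fingerprint(visitor_b))
```

Even with the user-agent field empty for both, the differing audio value alone is enough to give the two visitors distinct identifiers in this sketch.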

“HTML5 browsers use a library to do audio processing, but different software stacks produce a unique fingerprint in combination with other data,” he explained. “Similar techniques also work on the battery and WebRTC functions.”

Fingerprint ... Each browser type has its own way of processing audio that makes it easy to track, according to this slide by Arvind Narayanan

Narayanan and his team have been monitoring the behavior of ad trackers for years. In 2014, they discovered that 5,000 of the world's 100,000 most-visited websites were, in one way or another, using a canvas fingerprinting technique to identify netizens and follow them from page to page and site to site without their knowledge.

Further research last year found that ad networks were using session replay scripts, which he described as “analytics on steroids,” to stalk people online. Narayanan said he and his team found ad trackers on 8,000 websites leaking visitors' information in this way – including code on the website of American pharmacy chain Walgreens, which apparently handed confidential patient records to advertisers via forms, as well as the Gradescope assignment-grading software used by Princeton.

“This [session replay technique] left website owners and users pissed off,” he said. “Once we detailed the technique, the largest ad tracking providers stopped doing it. It seems sunlight is a great disinfectant.”

But this scrutiny only works up to a point, he warned. Netizen-tracking firms aren’t going to stop following people around the 'net and working out what interests them so they can be served targeted adverts and special offers. Narayanan was one of the team overseeing the now-imploded Do Not Track browser feature, and the ad industry was adamant: if 15 per cent or more of internet users turned tracking off, the banner networks would refuse to play ball and track them anyway.

Technical workarounds by ad blockers, such as Privacy Badger and Ghostery, are of some use, he said. But they are usually playing catch-up with ad trackers, not blocking them from the start.

The only way this is going to stop is if web browser programmers step up and build in measures to curb the ability to stalk users. But Narayanan said browser makers don’t want to get involved.

“Historically, web browsers consider it’s not their problem. Vendors are attempting to be neutral on this, and leave it to users to sort out,” he said. “To users that’s like an email provider saying that they are neutral on spam. Protection of privacy is a core reason for user choice.”

There have been some encouraging moves. The Brave browser has been developed specifically to neuter naughty advertising trackers, and both Firefox and Safari are making more of an effort in this area, he said. Chrome is also, we note, making noises in that direction.

But what’s needed is a fundamental rethink, with features that ensure tracking-free browsing, just as private browsing doesn’t record session data on a local workstation. Some kind of warning, similar to the HTTPS icon, would also be useful.

It’s important that these anti-surveillance techniques are implemented, he said, because privacy is vital to society – and there’s plenty of evidence showing a lack of privacy stifles debate. “Privacy is a lubricant that allows for social adaptability,” Narayanan opined. “If we move to a state of pervasive surveillance we lose that mobility.” ®
