Researchers working on browser fingerprinting found themselves distracted by a much more serious privacy breach: analytics scripts siphoning off vast amounts of user interaction data.
Steven Englehardt (a PhD student at Princeton), Arvind Narayanan (a Princeton assistant professor) and Gunes Acar (a postdoctoral researcher at Princeton) published their study at Freedom to Tinker last week. Their key finding is that session replay scripts are indiscriminate in what they scoop up, they do not ask for user permission, and there's evidence that the data isn't always handled securely.
Session replay is a popular user experience tool: it lets a publisher watch users navigating their site to work out why users leave a site and what needs improving.
As the authors wrote in their analysis: “These scripts record your keystrokes, mouse movements, and scrolling behavior, along with the entire contents of the pages you visit, and send them to third-party servers. Unlike typical analytics services that provide aggregate statistics, these scripts are intended for the recording and playback of individual browsing sessions, as if someone is looking over your shoulder.”
Speaking to Vulture South, Englehardt said the trio decided to analyse fingerprinting by injecting a unique value into Web pages to see where personal information was being sent.
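The bait technique Englehardt describes can be illustrated with a minimal sketch. It is not the researchers' actual tooling, which this article does not detail: the function names, the bait format and the example hosts are all assumptions made for illustration. The idea is to plant a unique value in a page, capture the outgoing requests, and list the third-party hosts whose requests contain that value in plain, URL-encoded or Base64 form.

```python
import base64
import uuid
from urllib.parse import quote, urlsplit


def make_bait():
    """Return a unique, e-mail-shaped marker to type into a page under test.

    The format is an assumption; any value unlikely to occur by chance works.
    """
    return "bait-" + uuid.uuid4().hex + "@example.com"


def encodings(bait):
    """Plain, URL-encoded and Base64 forms of the bait value.

    The Base64 form only matches when the bait was encoded on its own,
    not when it was embedded in a larger encoded payload.
    """
    return [
        bait,
        quote(bait, safe=""),
        base64.b64encode(bait.encode()).decode(),
    ]


def leaking_hosts(bait, first_party, requests):
    """Return third-party hosts whose captured requests contain the bait.

    `requests` is an iterable of (url, body) string pairs, for example
    exported from a proxy or an instrumented browser (hypothetical input).
    """
    needles = encodings(bait)
    hosts = set()
    for url, body in requests:
        host = urlsplit(url).hostname or ""
        # Skip the site itself and its subdomains.
        if host == first_party or host.endswith("." + first_party):
            continue
        haystack = url + "\n" + body
        if any(needle in haystack for needle in needles):
            hosts.add(host)
    return hosts
```

Called with the captured traffic from one page visit, `leaking_hosts` returns the set of outside hosts that received the planted value, which is the kind of signal that would point to a session replay script.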
“We didn't really expect to find” the session replay companies, he said.
The next surprise, he said, is how deep the session replay scripts dig.
Anonymity? They've heard of it
“You might think these recordings are anonymous, but some of the companies we studied are offering the option to identify the user -- so you know that Richard viewed your site, along with his e-mail address”, Acar told The Register.
One reason this happens, they explained, is that as publishers increasingly put content behind secured paywalls, user activity becomes hard to follow.
Englehardt said the page the user is viewing “might only exist behind the login”, meaning that to capture a session for replay to the publisher, the third-party company has to “scrape the whole page”.
The scripts the researchers examined came from companies including Yandex, FullStory, Hotjar, UserReplay, Smartlook, Clicktale, and SessionCam.
They also found replay scripts capturing checkout and registration processes.
The extent of that data collection could cause “sensitive information such as medical conditions, credit card details and other personal information displayed on a page to leak to the third-party as part of the recording”, they wrote.
Data can also leak to the outside world when the customer views the replay, because some of the session recording companies offer their playback over unsecured HTTP.
“Even when a Website is HTTPS, and the information is sent [to the session replay company] over HTTPS, when the publisher logs in to watch the video, they watch on HTTP”, Englehardt said.
That meant network-based third parties could snoop on the replay.
Companies that offered unsecured publisher dashboards included Yandex, Hotjar, and Smartlook.
The study also found the session replay scripts commonly ignore user privacy settings.
The EasyList and EasyPrivacy filter lists used by ad-blockers don't block FullStory, Smartlook, or UserReplay scripts, although “EasyPrivacy has filter rules that block Yandex, Hotjar, ClickTale and SessionCam.”
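To show roughly how a filter list decides whether a script gets blocked, here is a simplified sketch. It handles only domain-anchor rules of the form `||domain^` and ignores the rest of the Adblock-style syntax (options, exceptions, path patterns), so it should not be read as how any real blocker works in full. The rule and the hosts in the usage example are made up, not taken from EasyList or EasyPrivacy.

```python
from urllib.parse import urlsplit


def parse_domain_rules(lines):
    """Collect domains from '||domain^' rules, skipping comments and blanks.

    Any rule in a different shape is ignored by this sketch.
    """
    domains = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # '!' starts a comment line in Adblock-style lists
        if line.startswith("||") and line.endswith("^"):
            domains.add(line[2:-1].lower())
    return domains


def is_blocked(url, domains):
    """True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check the host and each parent domain: a.b.c, b.c, c.
    return any(".".join(parts[i:]) in domains for i in range(len(parts)))


# Illustrative usage with a hypothetical rule.
rules = parse_domain_rules(["! example list", "||tracker.example^"])
blocked = is_blocked("https://static.tracker.example/replay.js", rules)
```

A script served from a domain with no matching rule passes straight through, which is the position the study reports for FullStory, Smartlook and UserReplay.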
“At least one of the five companies we studied (UserReplay) allows publishers to disable data collection from users who have Do Not Track (DNT) set in their browsers,” the study said. ®
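For readers wondering what honouring Do Not Track involves on the publisher's side, the check itself is small. The sketch below assumes a server that decides, per request, whether to embed a recording script; the function name is hypothetical, and this is not a description of UserReplay's actual mechanism. Browsers with DNT enabled send the request header `DNT: 1`.

```python
def should_record(headers):
    """Return False when the request carries 'DNT: 1', True otherwise.

    `headers` is a mapping of header names to values; header names are
    compared case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() != "1"
    # No DNT header means the user expressed no preference.
    return True
```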