Privacy laws governing the use of personally identifiable information are, on a global scale, producing a dizzying patchwork of cookie-cutter cookie-serving companies set up to sniff our Web browsing.
That's one conclusion of research led by Marjan Falahrastegar at Queen Mary University of London. The group, which included collaborators from the Qatar Computing Research Institute and the University of Nottingham, has undertaken the job of untangling the spaghetti-like trails left by cookies and published its work on arXiv.
It won't surprise anyone to know that global sites dominate the cookie landscape: third-party cookie shippers owned by Google, Amazon and Facebook are distributed pretty evenly around the world, while Google, Yahoo! and AOL own between them the largest number of third-party cookie services.
Google has more than 40 domains serving third-party cookies and Verisign has 27, the research states, while Microsoft has 19, AOL has 18 and Yahoo! a mere 15.
Users familiar with how the Web works won't be surprised to learn that a “Google” cookie might come from Google or Blogger, Doubleclick or YouTube, Googleapis.com or 2mdn.net – but to the uninitiated, the multiplication of their tracked avatars across so many databases might be news.
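To make the point concrete, here is a minimal Python sketch of how cookie hosts could be rolled up under a single owner. It is an illustration, not the researchers' code: the lookup table contains only the domains named above, and the names OWNER_BY_DOMAIN, owner_of and group_by_owner are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup table built only from the domains named in the text;
# a real study would need a far larger, curated mapping.
OWNER_BY_DOMAIN = {
    "google.com": "Google",
    "blogger.com": "Google",
    "doubleclick.net": "Google",
    "youtube.com": "Google",
    "googleapis.com": "Google",
    "2mdn.net": "Google",
}


def owner_of(host):
    """Return the owner of a cookie host, matching on whole domain suffixes."""
    host = host.lower().lstrip(".")
    for domain, owner in OWNER_BY_DOMAIN.items():
        # Match the domain itself or any subdomain, but not look-alikes
        # such as "notgoogle.com".
        if host == domain or host.endswith("." + domain):
            return owner
    return "unknown"


def group_by_owner(hosts):
    """Group cookie hosts by the company assumed to operate them."""
    grouped = defaultdict(set)
    for host in hosts:
        grouped[owner_of(host)].add(host.lower().lstrip("."))
    return dict(grouped)


if __name__ == "__main__":
    seen = ["ad.doubleclick.net", "www.youtube.com", "s0.2mdn.net", "tracker.example"]
    for owner, hosts in sorted(group_by_owner(seen).items()):
        print(owner, sorted(hosts))
```

Seen this way, hosts that look unrelated to a casual user collapse into one owner, which is the multiplication the paragraph above describes.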
The picture is muddied by regional variations, and this is where privacy laws come into play: some trackers, the research states, are focussed on particular countries or regions (presumably because those places have privacy laws that demand the data be kept onshore). In other cases, though, trackers live far away from the people they're tracking.
It's probably unsurprising, for example, that the distribution of third-party tracking cookies is pretty even across Europe, East Asia, Oceania and South America. On the other hand, the number of third parties taking cookie data from users in Israel and Turkey is much higher. There's also a high prevalence of third-party services in the US, Germany and Russia, and US cookies are scattered all over the Middle East.
“Given the differences in regulatory regimes between jurisdictions, we believe this analysis sheds light on the geographical properties of this ecosystem and on the problems that these may pose to our ability to track and manage the different data silos that now store personal data about us all,” the researchers note.
The dataset was put together using a bunch of PlanetLab nodes so as to cover 28 countries. This service allowed them to access local websites in the target countries as though they were local browsers. The Etags Firefox extension with a bit of extra code was used to log the cookies, and the researchers wrote scripts to create unique user profiles, so as to prevent pollution of the cookie data from repeat visits to sites. ®
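As a rough illustration of the logging step described above, the following Python sketch flags cookies whose host does not belong to the site being visited, and visits each site with a clean slate. It is an assumption-laden sketch rather than the researchers' tooling: registrable_domain, third_party_cookies and crawl are hypothetical names, the "last two labels" rule is a deliberate simplification, and fetch_cookie_hosts stands in for whatever loads a page in a fresh browser profile.

```python
def registrable_domain(host):
    """Naive site identity: keep the last two labels of the host name.

    Simplification: this is wrong for suffixes such as "co.uk"; a real
    crawler would consult the Public Suffix List instead.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def third_party_cookies(page_host, cookie_hosts):
    """Return the cookie hosts that do not belong to the visited site."""
    site = registrable_domain(page_host)
    return sorted(
        {h.lower().lstrip(".") for h in cookie_hosts if registrable_domain(h) != site}
    )


def crawl(sites, fetch_cookie_hosts):
    """Log third-party cookie hosts per site.

    fetch_cookie_hosts is a hypothetical callable that loads one site in a
    fresh profile (empty cookie jar) and returns the hosts that set cookies,
    so that earlier visits cannot pollute later ones.
    """
    return {site: third_party_cookies(site, fetch_cookie_hosts(site)) for site in sites}


if __name__ == "__main__":
    # Canned data standing in for a real page load.
    fake_fetch = lambda site: ["cdn.example.org", "t.tracker.test"]
    print(crawl(["news.example.org"], fake_fetch))
```

Starting each visit with an empty cookie jar is what keeps one site's cookies from being attributed to the next, which is the pollution the researchers' per-profile scripts were meant to avoid.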