Underscoring the permanence of data published on the internet, a security researcher has compiled the names and URLs of more than 100 million Facebook users and made them available as a BitTorrent download.
Ron Bowles, who describes himself as a certified penetration tester, said he used some hastily written code to harvest the names of more than 100 million users who had set their accounts to be accessible to Google and other search engines. The list also includes the unique web address for each account, meaning the pages will remain accessible even if the users later configure their accounts to be private.
“Once I have the name and URL of a user, I can view, by default, their picture, friends, information about them, and some other details,” Bowles wrote in a blog post. “If the user has set their privacy higher, at the very least I can view their name and picture. So, if any searchable user has friends that are non-searchable, those friends just opted into being searched, like it or not! Oops :)”
Facebook strictly forbids the scraping of its content, so it's unclear what the consequences of Bowles's unauthorized move will be. Bowles's website at skullsecurity.org and skullsecurity.net was unavailable at time of writing for reasons that weren't clear. The researcher didn't respond to an email seeking comment.
At time of writing, the torrent indicated that almost 10,000 people had tried to download the file.
Facebook has reiterated that users can configure their accounts to be inaccessible to search engines. But as Bowles has already stated, that does nothing for those who want to remove their names after the fact.
In one sense, it's not particularly surprising that the information users have made available online might be compiled into a single file and become available elsewhere. As NewsArse.com succinctly put it, “Security experts warn that stuff you put on the Internet is on the Internet.”
But the incident also demonstrates a truism that many people on the net continue to ignore: Once something is put onto Twitter, Facebook or pretty much any other website, it is a permanent part of the internet record. And because of the wealth of web application vulnerabilities, that is often the case even when content has been designated as private.
This is almost certainly not the first time data has been scraped from Facebook – or from Twitter, LinkedIn, and dozens of other websites. And it certainly won't be the last. ®