Meta accuses data scrapers of taking more than their share
It's not that Facebook doesn't allow harvesting, it's more that it wasn't authorized, allegedly
Facebook parent Meta openly collects data from its billions of users, but when other companies scrape said data, it can be a problem, judging by a pair of lawsuits filed today.
Jessica Romero, Meta's Director of Platform Enforcement and Litigation, said the US tech giant has kicked off two federal lawsuits: one against scraping-for-hire company Octopus, and one against Ekrem Ateş, a Turkish individual who scraped Instagram data for use on a clone site.
Scraping involves extracting data from publicly available sources, such as profile pages, and in some cases, private data kept behind login pages. Part of the problem with companies like Octopus, Romero argued, is that they provide automated scraping services to anyone, regardless of who they may be targeting, or why, and – crucially – without permission from the source site.
Romero said Octopus is "a US subsidiary of a Chinese national high-tech enterprise that claims to have over one million customers." Its scraping software, Octoparse, is offered online and can reportedly scrape info from sites including those owned by Meta, Amazon, Twitter, Google, and LinkedIn.
According to Romero, users self-compromised their accounts when signing up for Octopus's services by handing login credentials over to the company. Octoparse was designed "to scrape data accessible to the user when logged into their accounts." Data scraped included email addresses, phone numbers, gender, birth date, post likes/comments, and more.
The lawsuit against Octopus alleges violations of Meta's terms of service and the US Digital Millennium Copyright Act, citing automated scraping without Meta's permission and attempts to hide that activity. Meta is seeking a permanent injunction barring Octopus from operating on any of Meta's sites.
We've reached out to Meta to learn more about Octopus and its allegations.
As for Ateş, Meta is claiming he scraped, without the web giant's blessing, the data of over 350,000 Instagram users to repost on a "clone site" called MyStalk that shows Instagram profile info and posts. Romero said Meta has taken multiple actions against Ateş since 2021, including disabling his accounts, serving him with a cease and desist, and revoking his access to Meta services.
Facebook has been scraped before. Over the course of nearly two years beginning in early 2018, a Ukrainian national named Alexander Alexandrovich Solonchenko extracted data on 178 million Facebook users. Facebook sued Solonchenko in October 2021.
Meta expanded its bug bounty program to include scraping attacks a couple of months later, but the language in the lawsuit reveals much about Meta's take on the sanctity of the data it's responsible for.
"The goal of this program is to find bugs that attackers utilize to bypass scraping limitations to access data at greater scale than the product intended," Facebook Security Engineering Manager Dan Gurfinkel said. Romero's essay echoes some of those sentiments: it calls Octopus's scraping "unauthorized," but expresses no objection to the data being scraped in the first place.
Those carefully chosen words shouldn't be ignored: Meta doesn't appear to mind people scraping data from its sites – provided they do it in a way the company approves of. ®