Millions of scraped public social net profiles left in open AWS S3 box
Poorly configured cloud buckets strike again – this time, LocalBlox fingered
US social network data aggregator LocalBlox has been caught leaving its AWS bucket of 48 million records – harvested in part from public Facebook, LinkedIn and Twitter profiles – available to be viewed by anyone who stopped by.
Security biz Upguard wandered by on February 18, and found the publicly accessible files in a misconfigured AWS S3 storage bucket located at the subdomain "lbdumps." There's no evidence that anyone else stopped by for a peek, but it's possible.
We're told the S3 bucket contained a single 151.3GB compressed representation of a 1.2TB ndjson (newline-delimited JSON) file, and that the database describes "tens of millions of individuals."
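For readers unfamiliar with the format, ndjson simply stores one JSON object per line, which is why a file of this size can be processed one record at a time rather than loaded whole. Here is a minimal Python sketch of that idea. The compression scheme LocalBlox used has not been stated, so gzip is an assumption here, and the sample records and field names are invented for illustration rather than taken from the exposed data.

```python
import gzip
import io
import json


def iter_ndjson(stream):
    """Yield one parsed JSON object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical records: these field names are invented for illustration
# and are not drawn from the LocalBlox file.
sample = (
    '{"name": "Alice Example", "source": "linkedin"}\n'
    '{"name": "Bob Example", "source": "facebook"}\n'
)

# Compress in memory to stand in for the archive. gzip is an assumption;
# the actual compression format was not reported.
compressed = gzip.compress(sample.encode("utf-8"))

# Decompress and parse line by line, so memory use stays flat
# no matter how large the underlying file is.
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as fh:
    records = list(iter_ndjson(fh))

print(len(records), records[0]["source"])
```

Collecting the records into a list only makes sense for a toy sample like this one; against a 1.2TB file, you would loop over the generator and handle each record as it arrives.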
Upguard, in a blog post on Wednesday, said it informed LocalBlox on February 28, and the bucket was secured later that day.
Poorly configured AWS S3 buckets have been a source of shame for Amazon Web Services and its users. Last year, the cloud platform giant introduced a tool to warn customers about insecure storage setups and earlier this year made the business version of the tool free, to avoid embarrassment by association.
Still, the problem persists and the forecast continues to look bleak. Last year, Gartner research VP Jay Heiser predicted that through 2020, "95 percent of cloud security failures will be the customer's fault."
According to Upguard, the data profiles appear to have been collected from multiple sources. They include names, street addresses, dates of birth, job histories scraped from LinkedIn, public Facebook profiles, Twitter handles, and Zillow real estate data, all linked by IP addresses.
Some of the data, the security company suggests, appears to have come from purchased databases and payday loan operators. Other data points – associated with queries like allSentences – appear to have been scraped through searches of Facebook.
LocalBlox has posted samples of its data profiles on its website.
"The presence of scraped data from social media sites like Facebook also highlights an important fact: all too often, data held by widely used websites can be targeted by unknown third parties seeking to monetize this information," Upguard said.
Facebook CEO Mark Zuckerberg recently acknowledged "we believe most people on Facebook could have had their public profile scraped" by "malicious actors."
Zuckerberg, testifying before Congress in the wake of the Cambridge Analytica scandal, insisted Facebook users have control over their data. From this case it looks more like no one has much control over it.
LocalBlox did not immediately respond to a request for comment. ®