That marketing email database that exposed 809 million contact records? Maybe make that two-BILLION-plus?
'This is a gigantic amalgamation of data all in one place' expert tells El Reg
Updated An unprotected MongoDB database belonging to a marketing tech company exposed up to 809 million email addresses, phone numbers, business leads, and bits of personal information to the public internet, it emerged yesterday.
Today, however, it appears the scope of that security snafu may have been underestimated.
According to cyber security biz DynaRisk, there were four databases exposed to the internet – rather than just the one previously reported – bringing the total to potentially more than two billion records weighing in at 196GB rather than 150GB.
Anyone knowing where to look on the 'net would have been able to spot and siphon off all that data, without any authentication.
"There was one server that was exposed to the web," Andrew Martin, CEO and founder of DynaRisk, told The Register on Friday. "On this server were four databases. The original discovery analysed records from mainEmailDatabase. The additional three databases were hosted on the same server, which is no longer accessible.
"Our analysis was conducted over all four databases and extracted over two billion email addresses which is more than the 809 million first discussed."
The databases were operated by Verifications.io, which provides enterprise email validation – a way for marketers to check that email addresses on their mailing lists are valid and active before firing off pitches. The Verifications.io website is currently inaccessible.
The database first reported included the following data fields, some of which, such as date of birth, qualify as personal information under various data laws:
- Email Records (emailrecords): a JSON object with the keys id, zip, visit_date, phone, city, site_url, state, gender, email, user_ip, dob, firstname, lastname, done, and email_lower_sha256.
- Email With Phone (emailWithPhone): No example provided but presumably a JSON object with the two named attributes.
- Business Leads (businessLeads): a JSON object with the keys id, email, sic_code, naics_code, company_name, title, address, city, state, country, phone, fax, company_website, revenue, employees, industry, desc, sic_code_description, firstname, lastname, and email_lower_sha256.
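The email_lower_sha256 key suggests each record carries a SHA-256 digest of the lower-cased email address, though how Verifications.io actually derived it has not been confirmed. A minimal Python sketch under that assumption follows; the function name and the sample record's values are made up for illustration.

```python
import hashlib


def email_lower_sha256(email: str) -> str:
    """Return the hex SHA-256 digest of the trimmed, lower-cased address.

    Assumption: this mirrors what the field name implies. The derivation
    actually used by the database owner is not documented.
    """
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Hypothetical record using a few of the keys listed above (invented values).
record = {
    "email": "Alice@Example.com",
    "firstname": "Alice",
    "city": "Springfield",
}
record["email_lower_sha256"] = email_lower_sha256(record["email"])
```

If the field works this way, two entries that differ only in letter case or stray whitespace would share one digest, which makes it a handy key for spotting duplicates.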
The image below shows Verifications.io's four MongoDB databases exposed to the internet, as identified by DynaRisk:
Martin said the severity of the security blunder is less than some may fear because there are no credit card numbers, medical records, nor any other bits of super-sensitive information involved.
"The issue here is this is a gigantic amalgamation of data all in one place," he explained. "The leaking of this information may breach data protection regulations in various countries. The leak may also violate the privacy and security provisions between Verifications.io and their clients within their contracts."
Bob Diachenko, a security researcher for consultancy Security Discovery, found the first Verifications.io database online, and said the marketing tech biz, based in Tallinn, Estonia, acknowledged the gaffe and hid the data silos from public view after he flagged it up.
Verifications.io told Diachenko that its company database was "built with public information, not client data." This suggests at least some of the email addresses and other details in the company's databases were downloaded or scraped from the internet.
Diachenko didn't immediately respond to a request for comment.
Security researcher Troy Hunt, who maintains the HaveIBeenPwned database of email accounts that have been exposed in online data dumps, said about a third of the email addresses in the Verifications.io database are new to HaveIBeenPwned. The other two thirds presumably were culled from the same online sources that supplied Hunt's archives.
Martin said Verifications.io's claim that its data came from public sources is open to interpretation. "These data sources might have been public at one time in the past and then not public at a later time," he said. "It would be interesting to know if the company had a process of continuous compliance where they would validate if they were still allowed to store the data over time."
Dtex, a security biz that focuses on the dangers of rogue or slipshod employees within businesses, said in its recent 2019 Insider Threat Intelligence Report that 98 per cent of incidents involving data left exposed in the cloud can be attributed to human error.
MongoDB versions prior to 2.6.0, released in 2014, were network accessible by default. Reversing that default setting hasn't persuaded people to securely configure their MongoDB installations, though. Out of the box, MongoDB requires no authentication to access, a detail a lot of folks appear to overlook. ®
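For admins wondering whether their own setup has the two weaknesses described above, here is a rough Python sketch that inspects an already-parsed mongod configuration. The option names net.bindIp, net.bindIpAll, and security.authorization come from MongoDB's configuration file format; the helper audit_mongod_config and the sample settings are hypothetical, and nothing here reflects how Verifications.io's server was actually configured.

```python
def audit_mongod_config(config: dict) -> list[str]:
    """Return warnings for a parsed mongod configuration (hypothetical helper)."""
    warnings = []
    net = config.get("net", {})
    security = config.get("security", {})

    # net.bindIp is a comma-separated list of listen addresses;
    # net.bindIpAll makes mongod listen on every interface.
    bind_ip = str(net.get("bindIp", ""))
    addresses = [a.strip() for a in bind_ip.split(",") if a.strip()]
    if net.get("bindIpAll") or any(a in ("0.0.0.0", "::") for a in addresses):
        warnings.append("listens on all network interfaces")

    # Access control stays off unless security.authorization is "enabled".
    if security.get("authorization") != "enabled":
        warnings.append("authentication is not enforced")

    return warnings


# Hypothetical settings: one wide-open instance, one locked-down instance.
exposed = {"net": {"port": 27017, "bindIp": "0.0.0.0"}}
hardened = {
    "net": {"port": 27017, "bindIp": "127.0.0.1"},
    "security": {"authorization": "enabled"},
}
```

This only reads the settings as written. It does not account for firewalls or cloud security groups, which can expose or shield a server regardless of what the config file says.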
Updated to add
Vinny Troia, who stumbled upon the exposed Verifications.io data along with Diachenko, maintains roughly 810 million netizens were exposed by the misconfigured MongoDB installation.
DynaRisk, meanwhile, told us it counted up more than two billion records from all the databases, and, after further analysis, identified a total of 999 million unique email addresses.
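The gap between total records and unique addresses comes down to duplicates: the same email can appear many times across databases. A minimal Python sketch of that kind of tally is below, assuming addresses are compared after trimming and lower-casing; the function, the second database name, and the sample rows are invented for illustration and are not DynaRisk's actual method.

```python
def count_unique_emails(databases: dict[str, list[dict]]) -> tuple[int, int]:
    """Return (total records, unique email addresses) across all databases."""
    total = 0
    unique = set()
    for records in databases.values():
        for record in records:
            total += 1
            email = record.get("email")
            if email:
                # Assumption: case and surrounding whitespace are ignored.
                unique.add(email.strip().lower())
    return total, len(unique)


# Made-up sample data; only mainEmailDatabase is a name from the report.
sample = {
    "mainEmailDatabase": [
        {"email": "alice@example.com"},
        {"email": "bob@example.com"},
    ],
    "otherDatabase": [
        {"email": "ALICE@example.com"},
        {"email": "bob@example.com "},
    ],
}
total_records, unique_emails = count_unique_emails(sample)
```

Here four records collapse to two unique addresses, the same effect, on a tiny scale, that separates a two-billion-plus record count from a far smaller tally of distinct emails.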