Microsoft leaks 6.5TB in Bing search data via unsecured Elastic server. *Insert 'Wow... that much?' joke here*

Not personal info, but there are worries over deanonymisation. Remember that AOL research database?

Microsoft earlier this month exposed a 6.5TB Elastic server to the world that included search terms, location coordinates, device ID data, and a partial list of which URLs were visited.

According to a report from cyber-security outfit WizCase, the server was password-protected until around 10 September, when "the authentication was removed," we're told.
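The report does not say how WizCase spotted the change, but the difference between a locked and an unlocked HTTP endpoint is easy to illustrate. The following is a minimal sketch, not WizCase's method: the function names are hypothetical, and it assumes a server that answers unauthenticated requests with 401 or 403 when access control is switched on. Only point it at systems you are authorised to test.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a rough verdict."""
    if status in (401, 403):
        return "auth required"
    if 200 <= status < 300:
        return "open to anyone"
    return "inconclusive"


def probe(url, timeout=5):
    """Send one unauthenticated GET to url and classify the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; the code is still informative.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Calling `probe()` with the address of your own test server would return "auth required" while the password is in place and "open to anyone" once it has been removed.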

WizCase code-prober Ata Hakcil discovered the leak on 12 September. The data appears to be generated by the Bing mobile app, which promises users "getting rewarded is easy, just search with the Bing," and has been downloaded more than 10 million times from Google's Play Store alone. The data was growing by up to 200GB per day and included searches from people in more than 70 countries, according to WizCase.

Once the data was left unsecured, several things happened. The infosec firm reported the problem to Microsoft on 13 September, and the Windows giant's security response centre pulled the database from public view on 16 September. That left plenty of time for hackers and bots to stumble across the data silo. WizCase said the server was twice hit by a Meow attack, in which a bot wipes unsecured databases and replaces them with new ones filled with the word "meow" repeated over and over. However, fresh telemetry from the Bing app continued to be collected in the silo. If the Meow bot found that data, it is likely that other interested parties did as well.

In mitigation, the information did not include personal details such as names, addresses or email addresses. A critical question, though, is whether the data was detailed enough to identify individual users of the search engine.

In 2006, AOL released what it thought was anonymised search data for research purposes, though journalists soon proved this wrong by identifying some of the searchers. One of the reasons why this was easy was that each searcher was identified by a numeric key, so it was possible to see all the searches made by a particular individual and then join the dots from clues in the queries.

It seems Microsoft's leaked data may likewise have privacy implications. WizCase screenshots show that the records include fields called deviceID, deviceHash, AdID and clientID, all of which are promising in terms of finding all the searches from a particular user. There are also coordinates showing location "within 500 metres," not precise enough to get an address, but helpful to someone trying to identify searchers.
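To see why a persistent identifier matters, consider a minimal sketch of the linking step. The field name `deviceID` comes from the WizCase screenshots; the record layout, the `query`, `lat` and `lon` keys, the function name and every value below are invented for illustration and are not taken from the leaked data.

```python
from collections import defaultdict

# Hypothetical records: only the deviceID field name is from the report.
records = [
    {"deviceID": "dev-a", "query": "dentist near me", "lat": 51.50, "lon": -0.12},
    {"deviceID": "dev-b", "query": "weather", "lat": 40.71, "lon": -74.00},
    {"deviceID": "dev-a", "query": "acme ltd staff login", "lat": 51.50, "lon": -0.12},
]


def group_by_identifier(rows, key="deviceID"):
    """Collect every query that shares the same persistent identifier."""
    profiles = defaultdict(list)
    for row in rows:
        profiles[row[key]].append(row["query"])
    return dict(profiles)


profiles = group_by_identifier(records)
for device, queries in profiles.items():
    print(device, queries)
```

No single query here names anyone, but once searches are grouped per device, a run of them plus a rough location can narrow the field considerably, which is what happened with the AOL data.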

The data also reveals some of the unsavoury things people search for, including illegal content. WizCase suggested that if criminals succeed in deanonymising the data, some individuals could be vulnerable to blackmail or phishing scams as a result.

Statcounter readings show just 2.83 per cent market share for Bing versus Google's 92.05 per cent. That said, it is a small percentage of a very large market, and Statcounter's figures may not reflect searches via the Bing app or those integrated into Windows search.

The security blunder is unfortunate for Microsoft, which advertises "simplified privacy controls" as one of the benefits of the iOS version of Bing Search.

A Microsoft spokesperson told us: “We’ve fixed a misconfiguration that caused a small amount of search query data to be exposed. After analysis, we’ve determined that the exposed data was limited and de-identified.”

Anybody can make a mistake, but there is an implicit deal with search providers like Microsoft and Google that we get personalisation and improved search results in return for allowing them to collect data on our behaviour. A high level of trust is required, and this kind of incident is damaging to that trust. The data was, apparently, not encrypted. ®
