Misconfigured Big Data apps are leaking data like sieves

Bank and health info included in more than a petabyte of files left lying around


More than a petabyte of data lies exposed online because of weak default settings and other configuration problems involving enterprise technologies.

Swiss security firm BinaryEdge found numerous instances of the Redis in-memory cache and store that can be accessed without authentication. Data on more than 39,000 MongoDB NoSQL databases is similarly exposed.

More than 118,000 instances of the Memcached general-purpose distributed memory caching system are likewise exposed to the web and leaking data, according to BinaryEdge. Finally, more than 8,000 Elasticsearch servers responded to probes.

BinaryEdge concludes that it found close to 1,175 terabytes (or 1.1 petabytes) of data exposed online, after looking into just four technologies as part of an online scan.

"Versions installed are quite often old and not updated, which means that, in some cases, not only is data exposed but even servers can be compromised," BinaryEdge concludes in a blog post on its research. "Companies are still figuring out how to use these technologies and by default they are not secure."

Tiago Henriques, chief exec at BinaryEdge, told El Reg that the problems it identified were almost always due to misconfigurations that exposed systems onto the internet, rather than inherent flaws in the software itself. Firewalls and other defensive technologies had not been deployed correctly to protect servers, leaving them open for BinaryEdge and others to probe.

"We haven't contacted the developers of these technologies. However, for example in the case of Redis and even some of the other technologies the developers clearly state that these services are not meant to be directly exposed, yet organisations keep ignoring these warnings and do it anyway," Henriques explained.
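Redis's own documentation makes the same point: the server is designed to be accessed only by trusted clients on a trusted network. A minimal hardening of redis.conf along those lines might look like the following sketch (the directive names are real Redis settings; the password is a placeholder, and protected-mode is only available from Redis 3.2 onward):

```
# redis.conf -- keep the server off the public internet
bind 127.0.0.1                      # listen only on the loopback interface
protected-mode yes                  # Redis 3.2+: refuse remote clients if no password is set
requirepass <long-random-password>  # placeholder; set a real secret
```

Pairing settings like these with a firewall rule that blocks the service port from the outside world addresses exactly the class of exposure BinaryEdge's scan turned up.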

Misconfigured installations were discovered in a wide range of organisations, from small businesses to top-500 companies. "Some of these technologies are used as cache servers, so its data is always changing and a multitude of client/company data can be looked at, for example, auth[entication] sessions information," BinaryEdge added.

Pressed for a better idea of the type of data exposed, Henriques offered a more detailed explanation of what the security firm found.

"We obviously didn't look at the actual data at all. However, we did do a small analysis on database/keys names. What we did with each technology was write probes that would request service status, like versions used, and database metadata, like names and sizes," he said.
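The kind of unauthenticated metadata probe Henriques describes can be sketched in a few lines of Python. This is not BinaryEdge's actual tooling, just an illustration of how little is needed to fingerprint an exposed Redis server: the host is a placeholder, and the end-of-reply check is deliberately crude.

```python
import socket


def parse_info(raw: bytes) -> dict:
    """Turn a Redis INFO reply into a dict, skipping '#' comment lines
    and the leading $<length> bulk-string header."""
    fields = {}
    for line in raw.decode("utf-8", errors="replace").splitlines():
        if line.startswith(("#", "$")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def redis_info(host: str, port: int = 6379, timeout: float = 3.0) -> dict:
    """Send the INFO command over a raw TCP socket -- no authentication
    involved -- using Redis's inline-command protocol."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"INFO\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if data.endswith(b"\r\n"):  # crude end-of-reply heuristic
                break
    return parse_info(b"".join(chunks))
```

Against an open server, `redis_info("203.0.113.9").get("redis_version")` would reveal whether the build is old and unpatched, which is precisely the version-and-metadata fingerprinting the scan relied on.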

"There are also a lot of usernames and passwords and also session tokens which could be used to take over active sessions. We also have databases from pharmaceutical companies, hospitals which are named 'patient' and 'doctor-list' and to finish we have banks as well, with databases named 'coin' and 'money'," he added.

In another case, a firm in the robotics industry had left files on its database such as "blueprints" and the names of projects exposed. BinaryEdge only looked at metadata pertaining to exposed files, rather than their contents. Some of these files might be honeytraps designed to divert hackers, of course, but it's hard to see this applying to anything more than a minority of cases.

BinaryEdge wants to use its research to build an "automated system that will alert companies of open technologies in their networks" which it intends to develop as a commercial service.

"We are going to warn companies for free when we do this type of publication. Business is important, but so is the safety of this data," Henriques said. "After we give them this warning, we will then offer them an optional service that we are developing called Timelines, where they can use our platform to scan and continuously monitor their perimeters." ®

