Google takes the PIS out of advertising: New algo securely analyzes shared encrypted data sets without leaking contents

Plus: MongoDB crams end-to-end crypto into database tech

Google on Wednesday released source code for a project called Private Join and Compute that allows two parties to analyze and compare shared sets of data without revealing the contents of each set to the other party.

This is useful if you want to see how your private, encrypted data set of, say, ad-click-to-sale conversion rates correlates with someone else's encrypted conversion-rate data set, without either side disclosing the actual numbers to the other.

This particular technique is a type of secure multiparty computation that builds upon a cryptographic protocol called Private Set-Intersection (PSI). Google employs this approach in a Chrome extension called Password Checkup that lets users test logins and passwords against a dataset of compromised credentials without revealing the query to the internet goliath.

Private Join and Compute, also known as Private Intersection-Sum (PIS), takes PSI further by hiding the data that represents the intersection of the two data sets and revealing only the results of calculations based on the data.

The technique is described in a research paper, "On Deploying Secure Computing Commercially: Private Intersection-Sum Protocols and their Business Applications," penned by nine Google researchers: Mihaela Ion, Ben Kreuter, Ahmet Erhan Nergiz, Sarvar Patel, Mariana Raykova, Shobhit Saxena, Karn Seth, David Shanahan, and Moti Yung.

The paper describes how PIS can be computed using three cryptographic protocols: Random Oblivious Transfer, encrypted Bloom filters, and Pohlig–Hellman double masking.
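To make the double-masking idea concrete, here is a minimal Python sketch of an intersection-sum exchange. It is loosely modeled on the Pohlig–Hellman-style approach and is not taken from Google's code. The assumptions are ours: toy parameters (Mersenne primes that are far too small and too structured for real use), a textbook Paillier scheme standing in for the additively homomorphic encryption such a protocol needs, and hypothetical names such as `private_intersection_sum`. Both parties run inside one function purely for illustration.

```python
import hashlib
import math
import secrets

# ASSUMPTION: toy parameters, chosen only because they are well-known primes.
P = 2**127 - 1                     # modulus for the exponentiation (masking) cipher
PA, QA = 2**61 - 1, 2**89 - 1      # Paillier primes
N = PA * QA
N2 = N * N
PHI = (PA - 1) * (QA - 1)
MU = pow(PHI, -1, N)               # inverse of PHI modulo N (Python 3.8+)
rng = secrets.SystemRandom()

def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P-1]."""
    digest = hashlib.sha256(identifier.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def new_mask_key() -> int:
    """Pick a secret exponent coprime to P-1, so masking is a permutation."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k

def mask(x: int, k: int) -> int:
    """Exponentiation masking; commutative: mask(mask(x, a), b) == mask(mask(x, b), a)."""
    return pow(x, k, P)

def paillier_encrypt(m: int) -> int:
    r = secrets.randbelow(N - 1) + 1
    return (1 + N * m) % N2 * pow(r, N, N2) % N2

def paillier_decrypt(c: int) -> int:
    return (pow(c, PHI, N2) - 1) // N * MU % N

def private_intersection_sum(ids_a, pairs_b):
    """Party A holds identifiers; party B holds {identifier: value} and the Paillier key."""
    k_a, k_b = new_mask_key(), new_mask_key()

    # Round 1 (A -> B): A masks its hashed identifiers with k_a and shuffles them.
    round1 = [mask(hash_to_group(v), k_a) for v in ids_a]
    rng.shuffle(round1)

    # Round 2 (B -> A): B adds its own mask to A's items, and sends its own
    # identifiers masked once, each paired with an encryption of its value.
    double_masked = {mask(x, k_b) for x in round1}
    round2 = [(mask(hash_to_group(w), k_b), paillier_encrypt(t))
              for w, t in pairs_b.items()]
    rng.shuffle(round2)

    # Round 3 (A -> B): A adds its mask to B's items, matches double-masked
    # values, and multiplies the matching ciphertexts (which adds the plaintexts).
    total_ct = paillier_encrypt(0)
    count = 0
    for single_masked, ct in round2:
        if mask(single_masked, k_a) in double_masked:
            total_ct = total_ct * ct % N2
            count += 1

    # B decrypts only the aggregate; neither side saw the other's raw identifiers.
    return count, paillier_decrypt(total_ct)

viewers = ["alice", "bob", "carol", "dave"]
purchases = {"bob": 30, "dave": 12, "erin": 99}
print(private_intersection_sum(viewers, purchases))  # (2, 42)
```

Because exponentiation commutes, an identifier masked by A then B matches the same identifier masked by B then A, while neither party can strip the other's mask to recover the underlying value.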


"Private Intersection-Sum is not an arbitrary question, but rather arose naturally and was concretely defined based on a given central business need: computing aggregate conversion rate (or effectiveness) of advertising campaigns," Google's researchers explain in their paper. "This problem has both great practical value and important privacy considerations, and represents a type of analysis that occurs surprisingly commonly."

As an example, Google researchers describe a scenario in which a city wants to know whether the cost of operating weekend train service is offset by increased revenues at local businesses. The city's rider data set and the point-of-sale data set from merchants can be processed using Private Join and Compute in a way that allows the city to determine the total number of train riders who made a purchase at a local store without revealing any identifying information.
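In plain terms, the only things the train example is meant to reveal are a count and a total. The sketch below computes that target output in the clear, using made-up data and a hypothetical function name. The point of Private Join and Compute is to reach the same answer without either side seeing the other's rows.

```python
def intersection_count_and_sum(rider_ids, purchases):
    """Reference output only: how many riders also bought something, and how much they spent."""
    matched = set(rider_ids) & set(purchases)
    return len(matched), sum(purchases[i] for i in matched)

# Hypothetical data: the city's rider IDs and the merchants' sales by customer ID.
riders = {"r1", "r2", "r3", "r4"}
sales = {"r2": 15, "r4": 40, "r9": 7}
print(intersection_count_and_sum(riders, sales))  # (2, 55)
```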

Google's researchers argue that reconciling organizations' hunger for data mining with rising interest in privacy requires secure computing protocols. "Indeed, the consideration given to privacy by users and governments around the world is growing rapidly," they observe.

In an email to The Register, Mike Rosulek, assistant professor of computer science at Oregon State University in the US, explained that PSI can replace the status quo, whereby Google and another company draft a legal agreement promising to share data to understand ad campaign effectiveness, generate aggregate data, and then dispose of each other's source data sets under contractual duress.

These PSI techniques let companies do this without the legal ritual. "With PSI there is no way to violate the 'agreement' because the cryptography literally prevents you from learning more than you are allowed," he said.

For those appearing in one of these data sets – an individual who saw a Google ad or bought an advertised product – PSI-sum computation offers much the same privacy proposition as the contract scenario, said Rosulek.

"Imagine a ghost appears to Sergey Brin in a dream and says 'people who saw this advertisement collectively spent $824,852 at Company X!'" he said. "If you feel like this ghastly vision is not a significant violation of your personal privacy, then you should be comfortable with PSI-sum, since it releases exactly the same information about you into the world."

Rosulek suggests the greatest benefit of this technology accrues to companies that would otherwise have forgone analytics altogether for fear of privacy problems.

While Google developed its technology as a privacy-preserving way to attribute aggregate ad conversions, the web giant says it hopes PIS can advance research into public policy, diversity and inclusion, healthcare, and vehicle safety by making secure computing more widely accessible.

At the moment, however, the code is not quite secure enough. The PIS security model envisions "honest-but-curious adversaries" and, as the GitHub repo notes, "If a participant deviates from the protocol, it is possible they could learn more than the prescribed information." What's more, the protocol neither ensures that parties supply legitimate inputs nor prevents them from supplying arbitrary ones. And there may be PIS leakage.

"For example, if an identifier has a very unique associated integer values, then it may be easy to detect if that identifier was in the intersection simply by looking at the intersection-sum," the GitHub repo cautions.
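A small Python sketch shows how bad that leakage can get when inputs are chosen adversarially. This is our own illustration with hypothetical names, not an attack described in the repo: a party that is free to pick arbitrary values assigns each identifier a distinct power of two, so the intersection-sum it receives spells out exactly which identifiers matched.

```python
def craft_values(identifiers):
    """Give each identifier a unique power of two instead of a genuine value."""
    return {ident: 1 << i for i, ident in enumerate(identifiers)}

def recover_members(identifiers, intersection_sum):
    """Read the matched identifiers back out of the sum's binary digits."""
    return {ident for i, ident in enumerate(identifiers)
            if (intersection_sum >> i) & 1}

# Hypothetical data: the cheating party's identifiers and the honest party's set.
my_ids = ["u0", "u1", "u2", "u3"]
crafted = craft_values(my_ids)
other_side = {"u1", "u3", "someone-else"}

# This is the only number the protocol is supposed to reveal.
leaked_sum = sum(v for ident, v in crafted.items() if ident in other_side)
print(leaked_sum)                           # 10
print(recover_members(my_ids, leaked_sum))  # the set containing 'u1' and 'u3'
```

Honest inputs leak far less, but a single outsized value in an otherwise ordinary data set gives away its owner's presence in much the same way.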

The code isn't officially supported by Google and comes with no guarantees. ®

Speaking of encryption... MongoDB Server 4.2 RC, unveiled at MongoDB World 2019 this week, includes a feature called client-side field level encryption. This allows clients to "selectively encrypt individual document fields, each optionally secured with its own key and decrypted seamlessly on the client," according to the software's maker.

This ensures data is encrypted by a client before it is sent to the database for storage, and decrypted by the client when it is fetched, providing end-to-end encryption. Whoever hosts the MongoDB database therefore cannot decipher the data because, ideally, only the client has the necessary keys.
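The flow can be sketched in a few lines of Python. This is a conceptual illustration only and does not use MongoDB's drivers or API: the "server" is a plain list, the function names are hypothetical, and the cipher is a toy hash-based construction standing in for the vetted authenticated encryption a real deployment would use.

```python
import hashlib
import hmac
import secrets

def _subkey(key: bytes, label: bytes) -> bytes:
    return hashlib.sha256(label + key).digest()

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    blocks = [hashlib.sha256(key + nonce + c.to_bytes(8, "big")).digest()
              for c in range((length + 31) // 32)]
    return b"".join(blocks)[:length]

def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """TOY cipher (assumption): XOR keystream plus an HMAC tag. Not for production."""
    nonce = secrets.token_bytes(16)
    data = plaintext.encode()
    stream = _keystream(_subkey(key, b"enc"), nonce, len(data))
    body = bytes(a ^ b for a, b in zip(data, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def decrypt_field(key: bytes, blob: bytes) -> str:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered field")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode()

def insert_document(server_docs, doc, field_keys):
    """Client side: encrypt only the fields that have a key, then hand off the rest as-is."""
    server_docs.append({name: encrypt_field(field_keys[name], value)
                        if name in field_keys else value
                        for name, value in doc.items()})

def fetch_document(server_docs, index, field_keys):
    """Client side: decrypt the protected fields after fetching."""
    return {name: decrypt_field(field_keys[name], value)
            if name in field_keys else value
            for name, value in server_docs[index].items()}

# Each protected field gets its own key; the keys never leave the client.
keys = {"ssn": secrets.token_bytes(32), "card": secrets.token_bytes(32)}
server = []  # stands in for the hosted database
insert_document(server, {"name": "Ada", "ssn": "123-45-6789", "card": "4111"}, keys)
print(server[0]["name"], type(server[0]["ssn"]).__name__)  # Ada bytes
print(fetch_document(server, 0, keys)["ssn"])              # 123-45-6789
```

The host only ever stores the opaque bytes for `ssn` and `card`, which is the property the feature is selling.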

Biting the hand that feeds IT © 1998–2021