Apple's system to scan iCloud-bound photos on iOS devices to find illegal child sexual abuse material (CSAM) is supposed to ship in iOS 15 later this year.
However, the NeuralHash machine-learning model involved in that process appears to have been present on iOS devices at least since the December 14, 2020 release of iOS 14.3. It has been adapted to run on macOS 11.3 or later using the API in Apple's Vision framework. And thus exposed to the world, it has been probed by the curious.
In the wake of Apple's initial child safety announcement two weeks ago, several developers have explored Apple's private NeuralHash API and provided a Python script to convert the model into a convenient format – the Open Neural Network Exchange (ONNX) – for experimentation.
On Wednesday, Intel Labs research scientist Cory Cornelius used these resources to create a hash collision – two different images that, when processed by the algorithm, produce the same NeuralHash identifier.
That's expected behavior from perceptual hashing, which is designed to compute the same identifier for similar images – the idea is that one shouldn't be able to, say, convert a CSAM image from color to grayscale to evade hash-based detection.
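To make the idea concrete, here is a minimal sketch of a toy perceptual hash in Python. It is not NeuralHash, which is a neural network. It is a simple "average hash" chosen purely for illustration, and the 4x4 pixel grids and the function name are made up for the example. The point is only that small edits, such as a brightness shift or mild re-encoding noise, leave the identifier unchanged, while a genuinely different image gets a different one.

```python
# Toy "average hash" over a grayscale image given as rows of 0-255 integers.
# Illustrative stand-in only; this is NOT Apple's NeuralHash.

def average_hash(pixels):
    """Return a bit string with 1 wherever a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

# Hypothetical 4x4 grayscale image: bright and dark blocks in a checker layout.
original = [
    [200, 190, 40, 30],
    [210, 180, 50, 20],
    [60, 45, 220, 205],
    [35, 55, 215, 195],
]

# The same picture, uniformly brightened by 10 levels.
brighter = [[min(255, p + 10) for p in row] for row in original]

# The same picture with small deterministic "re-encoding" noise of -3, 0 or +3.
noisy = [
    [p + ((i * 4 + j) % 3 - 1) * 3 for j, p in enumerate(row)]
    for i, row in enumerate(original)
]

# A different picture: a smooth dark-to-light gradient.
gradient = [[16 * (i * 4 + j) for j in range(4)] for i in range(4)]

for name, img in [("original", original), ("brighter", brighter),
                  ("noisy", noisy), ("gradient", gradient)]:
    print(name, average_hash(img))
```

The first three lines of output share one bit string and the gradient gets another, which is the behavior a perceptual hash is meant to have.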
As Apple explains in its technical summary [PDF], "Only another image that appears nearly identical can produce the same number; for example, images that differ in size or transcoded quality will still have the same NeuralHash value."
But in this instance, the matching hashes come from completely dissimilar images – a beagle and a variegated gray square. And that finding amplifies ongoing concern that Apple's child safety technology may be abused to cause inadvertent harm, for instance by giving someone an innocent image that is wrongly flagged as CSAM.
Apple has said there is "an extremely low (1 in 1 trillion) probability of incorrectly flagging a given account," but as Matthew Green, associate professor of computer science at Johns Hopkins, observed via Twitter, Apple's statistics don't cover the possibility of "deliberately-constructed false positives."
"It was always fairly obvious that in a perceptual hash function like Apple’s, there were going to be 'collisions' — very different images that produced the same hash," said Green in reference to the collision demo. "This raised the possibility of 'poisoned' images that looked harmless, but triggered as child sexual abuse media."
Jonathan Mayer, assistant professor of computer science and public affairs at Princeton University, told The Register that this does not mean that Apple's NeuralHash image matching scheme is broken.
"That would be a reasonable response if NeuralHash were a cryptographic hash function," explained Mayer. "But it's a perceptual hash function, with very different properties."
With cryptographic hash functions, he said, you're not supposed to be able to find two inputs with the same output. The formal term for that is "second-preimage resistance."
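The contrast between the two kinds of hash can be sketched in a few lines of Python. SHA-256 from the standard library stands in for a cryptographic hash, and a toy "average hash" stands in for a perceptual one. The toy hash is an assumption made for illustration and has nothing to do with NeuralHash's internals, and the 2x2 pixel grids are invented. Because the toy function is so coarse, it is also easy to hand-build a different grid that lands on the same identifier, which is the general shape of a constructed collision.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: 1 wherever a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes (values must be 0-255)."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 2x2 grayscale images.
image = [[200, 40], [30, 220]]
tweaked = [[201, 40], [30, 220]]    # a single pixel changed by one level
collider = [[130, 120], [0, 255]]   # different pixel values, built to share the toy hash

for name, img in [("image", image), ("tweaked", tweaked), ("collider", collider)]:
    # Print the toy perceptual hash next to a prefix of the SHA-256 digest.
    print(name, average_hash(img), sha256_hex(img)[:16])
```

All three grids get the same toy perceptual hash, while their SHA-256 digests all differ, even for the one-level tweak. A real perceptual hash is far harder to collide by hand than this toy, but the design goal of tolerating input changes is the same.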
"With a perceptual hash function, by comparison, a small change to the input is supposed to produce the same output," said Mayer. "These functions are designed specifically not to have second-preimage resistance."
Mayer said while he worries the collision proof-of-concept will provoke an overreaction, he's nonetheless concerned. "There is a real security risk here," he said.
Of greatest concern, he said, is an adversarial machine-learning attack that generates images that match CSAM hashes and appear to be possible CSAM during Apple's review process. Apple, he said, can defend against these attacks and, in fact, describes some planned mitigations in its documentation.
Apple, said Mayer, "has both a technical mitigation (running a separate, undisclosed server-side perceptual hash function to check for a match) and a process mitigation (human review). Those mitigations have limits, and they still expose some content, but Apple has clearly thought about this issue."
"I’m less concerned about the attack than some observers, because it presupposes access to known CSAM hashes," said Mayer. "And the most direct way to get those hashes is from source images. So it presupposes an attacker committing a very serious federal felony."
Mayer's objections have more to do with the way Apple handled its child safety announcement, which even the company itself was forced to concede has led to misunderstandings.
"I find it mind boggling that Apple wasn't prepared to discuss this risk, like so many other risks surrounding its new system," said Mayer. "Apple hasn't seriously engaged with the information security community, so we're going to have a slow drip of concerning developments like this, with little context for understanding."
The Register asked Apple to comment, but we expect to hear nothing.
Apple, aware of these developments, reportedly held a call for the press in which the company downplayed the hash collision and cited safeguards such as the operating system's code signing to guarantee the integrity of the NeuralHash model, human review, and a redundant algorithmic check that runs server-side.
Nonetheless, AsuharietYgvar, the pseudonymous individual who made the NeuralHash model available in ONNX format and who asked to be identified as "an average concerned citizen," expressed concern that Apple was misinforming the public, and skepticism about the supposed server-side check.
"If their claim was true, the collision would appear to no longer be a problem since it's impossible to retrieve the algorithm they are using on the servers," said AsuharietYgvar in a message to The Register. "However, this is highly questionable because it adds a black box in the detection process, which no one can perform security audits on.
"We already know that NeuralHash is not as robust as Apple claimed. Who can believe their secret, non-audited secondary check will be better? Considering that Apple already described their NeuralHash and Private Set Intersection algorithms in detail, it's ironic that eventually they decided to keep the integral parts in secret to combat security researchers. And if I did not make their NeuralHash public, we will never know that the algorithm is that easy to defeat."
"Another real problem is that this system can be easily worked around to store CSAM materials without being detected," AsuharietYgvar continued. "Since the NeuralHash model is public now it's trivial to implement an algorithm which completely changes the hash without introducing visible difference. This will make those materials easily pass the initial on-device check.
"I believe what I did was a firm step against mass surveillance, but certainly this will not be enough. We cannot let Apple's famous 1984 ad become a reality. At least not without a fight." ®