Boffins in Germany have devised a technique to subvert neural network frameworks so they misidentify images without any telltale signs of tampering.
Erwin Quiring, David Klein, Daniel Arp, Martin Johns, and Konrad Rieck, computer scientists at TU Braunschweig, describe their attack in a pair of papers, slated for presentation at technical conferences in May and in August this year – events that may or may not take place given the COVID-19 global health crisis.
The papers, titled "Adversarial Preprocessing: Understanding and Preventing Image-Scaling Attacks in Machine Learning" [PDF] and "Backdooring and Poisoning Neural Networks with Image-Scaling Attacks" [PDF], explore how the preprocessing phase involved in machine learning presents an opportunity to fiddle with neural network training in a way that isn't easily detected. The idea being: secretly poison the training data so that the software later makes bad decisions and predictions.
This example image of a cat, provided by the academics, has been modified so that, when downscaled by an AI framework for training, it turns into a dog, thus muddying the training dataset.
There have been numerous research projects that have demonstrated that neural networks can be manipulated to return incorrect results, but the researchers say such interventions can be spotted at training or test time through auditing.
"Our findings show that an adversary can significantly conceal image manipulations of current backdoor attacks and clean-label attacks without an impact on their overall attack success rate," explained Quiring and Rieck in the Backdooring paper. "Moreover, we demonstrate that defenses – designed to detect image scaling attacks – fail in the poisoning scenario."
Their key insight is that algorithms used by AI frameworks for image scaling – a common preprocessing step to resize images in a dataset so they all have the same dimensions – do not treat every pixel equally. Instead, these algorithms, specifically those in the imaging libraries OpenCV (used by Caffe), tf.image (used by TensorFlow), and Pillow (used by PyTorch), consider only a third of the pixels to compute scaling.
"This imbalanced influence of the source pixels provides a perfect ground for image-scaling attacks," the academics explained. "The adversary only needs to modify those pixels with high weights to control the scaling and can leave the rest of the image untouched."
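The point about high-weight pixels can be illustrated with a toy sketch. It assumes a simplified nearest-neighbor rule that keeps every fourth pixel in each direction; the real libraries use their own sampling conventions, and the function names and grayscale values below are hypothetical stand-ins, not the researchers' code.

```python
def nearest_downscale(img, factor):
    """Simplified nearest-neighbor scaling: keep every `factor`-th pixel
    in each direction and ignore everything in between (an assumption,
    not the exact rule used by any particular library)."""
    return [row[::factor] for row in img[::factor]]


def embed_target(source, target, factor):
    """Overwrite only the pixels the scaler will sample with the target's pixels."""
    attacked = [row[:] for row in source]  # copy so the source stays intact
    for i, target_row in enumerate(target):
        for j, value in enumerate(target_row):
            attacked[i * factor][j * factor] = value
    return attacked


factor = 4
source = [[200] * 16 for _ in range(16)]  # bright 16x16 stand-in for the "cat"
target = [[10] * 4 for _ in range(4)]     # dark 4x4 stand-in for the "dog"

attacked = embed_target(source, target, factor)
scaled = nearest_downscale(attacked, factor)

# Count how many source pixels the attack had to touch.
changed = sum(
    a != s
    for attacked_row, source_row in zip(attacked, source)
    for a, s in zip(attacked_row, source_row)
)
print(f"modified {changed} of {16 * 16} pixels")
print("downscaled output equals target:", scaled == target)
```

In this toy setup only one pixel in sixteen is altered, yet the downscaled result is exactly the target image, because the untouched pixels never contribute to the output.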
On their explanatory website, the eggheads show how they were able to modify a source image of a cat, without any visible sign of alteration, to make TensorFlow's nearest scaling algorithm output a dog.
This sort of poisoning attack during the training of machine learning systems can result in unexpected output and incorrect classifier labels. Adversarial examples can have a similar effect, the researchers say, but these work against only one specific machine learning model.
Image scaling attacks "are model-independent and do not depend on knowledge of the learning model, features or training data," the researchers explained. "The attacks are effective even if neural networks were robust against adversarial examples, as the downscaling can create a perfect image of the target class."
The attack has implications for facial recognition systems in that it could allow a person to be identified as someone else. It could also be used to meddle with machine learning classifiers such that a neural network in a self-driving car could be made to see an arbitrary object as something else, like a stop sign.
To mitigate the risk of such attacks, the boffins say the area scaling capability implemented in many scaling libraries can help, as can Pillow's scaling algorithms (so long as it's not Pillow's nearest scaling scheme). They also discuss a defense technique that involves image reconstruction.
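A rough sketch of why area scaling helps, under the same toy assumptions as a simplified every-fourth-pixel nearest-neighbor scaler: area scaling is modeled here as a plain block average, and the function name and pixel values are hypothetical. It is not the implementation found in any specific library.

```python
def area_downscale(img, factor):
    """Toy area scaling: each output pixel is the mean of a factor x factor block,
    so every source pixel carries the same weight."""
    out = []
    for top in range(0, len(img), factor):
        row = []
        for left in range(0, len(img[0]), factor):
            block = [
                img[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


factor = 4
# A bright 16x16 image in which an attacker has darkened only the pixels
# that a simplified nearest-neighbor scaler would sample.
attacked = [[200] * 16 for _ in range(16)]
for y in range(0, 16, factor):
    for x in range(0, 16, factor):
        attacked[y][x] = 10

nearest = [row[::factor] for row in attacked[::factor]]
area = area_downscale(attacked, factor)

print("nearest output pixel:", nearest[0][0])  # the attacker's value
print("area output pixel:", area[0][0])        # stays close to the original 200
```

Because every pixel in a block contributes equally, the handful of modified pixels only nudges the average, and the attacker would have to alter most of the image, visibly, to control the output.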
The researchers plan to publish their code and dataset on May 1, 2020. They say their work shows the need for more robust defenses against image-scaling attacks, and they observe that other types of data that get scaled, like audio and video, may be vulnerable to similar manipulation in the context of machine learning. ®