Computer vision systems display a “correlation bias” that makes it possible to create adversarial images that could have real-world consequences, such as messing with self-driving cars’ ability to accurately interpret road signs.
That assertion was made today at the Black Hat Asia conference by Paul Ziegler, CEO of risk management software vendor Reflare, and Yin Minn Pa Pa, a senior researcher and manager at Deloitte Tohmatsu Cyber LLC and Deloitte Japan. Masaki Kamizono, CTO and partner at Deloitte Tohmatsu Cyber LLC and Deloitte Japan, also worked on the research.
In a talk titled “Hiding Objects From Computer Vision By Exploiting Correlation Biases”, Ziegler and Pa Pa explained that they used computer vision systems, including those offered online by Microsoft and Google, to survey images from the Common Objects in Context (COCO) dataset.
In their talk, the pair said that one way computer vision systems recognise objects is to consider them in context, and said their work has detected odd outcomes resulting from that practice. An image of a plant on a t-shirt is hardly ever identified as a plant, because algorithms focus on people and don’t expect a plant to be on your abdomen. Anything round pictured near a dog, they explained, is often identified as a frisbee because computer vision systems correlate dogs and flying disc toys.
- Can your AI code be fooled by vandalized images or clever wording? Microsoft open sources a tool to test for that
- How to hide a backdoor in AI software – such as a bank app depositing checks or a security cam checking faces
- Microsoft rolls out mask detection to Azure Cognitive Services. And yes, there is a noseAndMouthCovered attribute
- Machine-learning model creates creepiest Doctor Who images yet – by scanning the brain of a super fan
The team therefore started to create composite images and test them with the YOLOv3 object detection system, to see if they could effectively hide things by placing them next to objects that computer vision systems have learned to treat as unlikely companions.
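The researchers have not published their exact pipeline, but the compositing step they describe can be sketched roughly as below: paste a target object (say, a STOP sign) next to an out-of-context image (say, fruit) and feed the result to a detector. The function name, image sizes, and stand-in arrays here are illustrative assumptions, not the team's actual code.

```python
import numpy as np

def composite_side_by_side(obj_img: np.ndarray, context_img: np.ndarray) -> np.ndarray:
    """Place a target object image next to an out-of-context image.

    Both inputs are H x W x 3 uint8 arrays; the context image is cropped
    or zero-padded vertically to match the object image's height.
    """
    h = obj_img.shape[0]
    if context_img.shape[0] >= h:
        ctx = context_img[:h]
    else:
        ctx = np.pad(context_img, ((0, h - context_img.shape[0]), (0, 0), (0, 0)))
    return np.concatenate([obj_img, ctx], axis=1)

# Flat-colour stand-ins for real photos (e.g. a STOP sign and some fruit).
stop_sign = np.full((416, 416, 3), 200, dtype=np.uint8)
fruit = np.full((416, 300, 3), 90, dtype=np.uint8)

mashup = composite_side_by_side(stop_sign, fruit)
print(mashup.shape)  # (416, 716, 3)
# The mashup would then be run through a detector such as YOLOv3 to check
# whether the unlikely pairing suppresses detection of the sign.
```

In practice the mashup would be passed to a pre-trained YOLOv3 model (416×416 is a common YOLOv3 input resolution, hence the sizes chosen here) and the detector's output classes compared against the objects actually present.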
That approach saw computer vision systems suggest that mashup images of dogs and cats depicted a horse.
Another interesting result came with images of STOP signs and fruit. Computer vision systems spotted the fruit, but could not identify the STOP sign. The researchers created a STOP sign, photographed it against odd backgrounds, and succeeded in making computer vision systems fail to detect it.
In their talk, the pair asserted this bias could be turned into an attack. They posited a self-driving car that augments its real-time sensors with computer vision being confronted by a STOP sign that had deliberately been placed among out-of-place objects, or had a poster attached behind it. Ziegler suggested that creating composite images could be a way to evade content filters designed to prevent upload of certain images, or that an unusual combination of goods could confuse systems at unstaffed retailers like Amazon Go.
The team’s work was inquisitive, not adversarial, so they’ve not tested attacks.
Nor have they tested whether the potential for attacks will persist, because their tests were conducted in March 2021 and computer vision algorithms are constantly enhanced.
But Ziegler did say this work suggests that efforts to evade computer vision systems are feasible. ®
The Register asked the researchers if their work had found that with images depicting both pineapples and pizza, the computer vision systems failed to detect one or the other, in the hope of securing proof that the combination is inappropriate. Ziegler opined that any test would surely have crashed their test systems with an “out of morals panic”, but that as pineapples are not a class in COCO, a definitive answer to the question was not possible.