A trio of Belgium-based boffins have created a ward that renders wearers unrecognizable to software trained to detect people.
In a research paper distributed through arXiv in advance of its presentation at computer vision workshop CV-COPS 2019, Simen Thys, Wiebe Van Ranst and Toon Goedeme from KU Leuven describe how some colorful abstract signage can defend against Darknet, an open source neural network framework that supports You Only Look Once (YOLO) object detection.
The paper is titled "Fooling automated surveillance cameras: adversarial patches to attack person detection."
Adversarial images that dupe machine learning systems have been the subject of considerable research in recent years. While there have been many examples of specially crafted objects that trip up computer vision systems, like stickers that can render stop signs unrecognizable, the KU Leuven boffins contend no previous work has explored adversarial images that mask a class of things as diverse as people.
"The idea behind this work is to be able to circumvent security systems that use a person detector to generate an alarm when a person enters the view of a camera," explained Wiebe Van Ranst, a PhD researcher at KU Leuven, in an email to The Register. "Our idea is to generate an occlusion pattern that can be worn by a possible intruder to conceal the intruder from the detector."
What makes the work challenging, he said, is how varied people are in the way they appear, with different clothing, poses, and so on.
The researchers targeted the popular YOLOv2 convolutional neural network, first running it over their dataset of images to obtain bounding boxes that outline the people the detection algorithm identifies.
"On a fixed position relative to these bounding boxes, we then apply the current version of our patch to the image under different transformations," they explain in their paper.
"The resulting image is then fed (in a batch together with other images) into the detector. We measure the score of the persons that are still detected, which we use to calculate a loss function. Using back propagation over the entire network, the optimiser then changes the pixels in the patch further in order to fool the detector even more."
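For readers who want to see the shape of that loop, here is a minimal sketch in plain Python. It is not the researchers' code and it does not touch YOLOv2 or Darknet: the "detector" is a made-up logistic scorer over a flat list of pixel values, the only transformation is a random brightness change, and every name in it (`detector_score`, `apply_patch`, `train_patch` and so on) is hypothetical. It only illustrates the general idea of pasting a patch at a fixed position, scoring the result, and nudging the patch pixels along the gradient to lower the detection score.

```python
import math
import random


def detector_score(image, weights, bias):
    """Toy stand-in for a person detector: returns a confidence in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, image)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def apply_patch(image, patch, offset, brightness=1.0):
    """Overwrite a fixed region of the image with the (dimmed) patch."""
    patched = list(image)
    for j, p in enumerate(patch):
        patched[offset + j] = brightness * p
    return patched


def mean_score(images, patch, offset, weights, bias):
    """Average detector confidence over a batch of patched images."""
    scores = [detector_score(apply_patch(img, patch, offset), weights, bias)
              for img in images]
    return sum(scores) / len(scores)


def train_patch(images, weights, bias, patch_len, offset,
                steps=300, lr=0.5, seed=0):
    """Gradient descent on the patch pixels to lower the mean score."""
    rng = random.Random(seed)
    patch = [rng.random() for _ in range(patch_len)]
    for _ in range(steps):
        grad = [0.0] * patch_len
        for img in images:
            # Assumed transformation: a random brightness change only.
            brightness = rng.uniform(0.7, 1.0)
            score = detector_score(
                apply_patch(img, patch, offset, brightness), weights, bias)
            # Chain rule through the sigmoid back to each patch pixel.
            for j in range(patch_len):
                grad[j] += (score * (1.0 - score)
                            * weights[offset + j] * brightness)
        # Step against the averaged gradient, keeping pixels in [0, 1].
        patch = [min(1.0, max(0.0, p - lr * g / len(images)))
                 for p, g in zip(patch, grad)]
    return patch


if __name__ == "__main__":
    rng = random.Random(1)
    n_pixels, patch_len, offset = 16, 4, 6
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]
    images = [[rng.random() for _ in range(n_pixels)] for _ in range(8)]
    grey = [0.5] * patch_len
    trained = train_patch(images, weights, 0.5, patch_len, offset)
    print("grey patch:   ", mean_score(images, grey, offset, weights, 0.5))
    print("trained patch:", mean_score(images, trained, offset, weights, 0.5))
```

A real attack would swap the toy scorer for the actual network and let a deep learning framework compute the gradients by back propagation, as the paper describes. The structure of the loop stays the same.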
Van Ranst said footage from the targeted surveillance camera can be used to train a more reliable patch. "However, this is not strictly necessary, we can also use an existing database of images as training data (as we do in the paper)," he said.
"In later experiments we did however notice that our current technique can be quite sensitive to the dataset our detector was trained on. Making it more robust to these cases is something we would like to investigate in the future."
The result of this process, a colorful patch that's 40cm (~16 inch) square, is just a bit larger than the cardboard sleeve of a vinyl record or a glossy magazine. It has been formulated to throw off the YOLOv2 software's ability to identify people.
The researchers' work can be seen in this YouTube video.
"In most cases our patch is able to successfully hide the person from the detector," the researchers explain in their paper. "Where this is not the case, the patch is not aligned to the center of the person."
Looking ahead, the researchers hope to generalize their work to other neural network architectures like Faster R-CNN. They believe that they will be able to turn their pattern into a T-shirt print that will make people "virtually invisible" to object-detection algorithms in automatic surveillance cameras.
Presently, however, the pattern needs to be directly visible to the camera being fooled. According to Van Ranst, further work needs to be done to make the pattern functional when viewed at an angle. ®