Boffins don bad 1980s fashion to avoid being detected by object-recognizing AI cameras

Adversarial T-shirt disguise hoodwinks machine-learning algorithms


In a pleasing symmetry, boffins have used machine-learning algorithms to develop a T-shirt design that lets its wearer evade detection by object-recognition cameras.

Brainiacs at Northeastern University, MIT, and IBM Research in the US teamed up to create the 1980s-esque fashion statement, according to a paper quietly emitted via arXiv in mid-October. Essentially, the aim is to create a T-shirt that fools AI software into not detecting and classifying the wearer as a person. This means the wearer could slip past visitor or intruder detection systems, and so on.

The T-shirt design is a classic adversarial example: the pattern on the shirt has been carefully crafted to manipulate just the right parts of a detection system's neural network so that it misidentifies the wearer. Previous adversarial experiments have typically involved flat or rigid objects, such as printed stickers or 3D-printed toy turtles.
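To picture the underlying idea, here's a minimal sketch of the textbook fast-gradient-sign method (FGSM) against a plain image classifier. This is not the paper's patch-optimization method, just the classic way a small, deliberately chosen pixel perturbation flips a model's output; `model`, `image`, and `label` are placeholders:

```python
import torch
import torch.nn.functional as F

def fgsm(model, image, label, eps=0.03):
    # Classic fast-gradient-sign adversarial example: nudge every pixel
    # a small step (eps) in the direction that increases the model's
    # loss, which is often enough to flip the prediction.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + eps * image.grad.sign()
    return adversarial.clamp(0, 1).detach()  # keep pixels in a valid range
```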

Now, this team, at least, has shown it’s possible to trick computer-vision models with more flexible materials like T-shirts, too.

“We highlight that the proposed adversarial T-shirt is not just a T-shirt with printed adversarial patch for clothing fashion, it is a physical adversarial wearable designed for evading person detectors in a real world,” the paper said.

In this case, the adversarial T-shirt helped a person evade detection. The two convolutional neural networks tested, YOLOv2 and Faster R-CNN, are trained to identify objects: under normal circumstances, given a photo containing people, each should draw a bounding box around them and label them as "person".
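As a rough illustration of that detection step, here is a sketch using torchvision's off-the-shelf, COCO-trained Faster R-CNN as a stand-in; the paper attacked its own YOLOv2 and Faster R-CNN builds, and the 0.5 score threshold here is illustrative:

```python
import torch
import torchvision

# Stand-in detector: a pretrained Faster R-CNN from torchvision,
# trained on COCO, in which class 1 is "person".
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_people(frame: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    # frame: a (3, H, W) float tensor with values in [0, 1]
    with torch.no_grad():
        output = model([frame])[0]
    person = (output["labels"] == 1) & (output["scores"] > threshold)
    return output["boxes"][person]  # one (x1, y1, x2, y2) box per person found
```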

But you can trick the system and avoid being noticed at all by wearing the adversarial T-shirt. “Our method aims to make the person's bounding box vanish rather than misclassify the person as a wrong label,” an IBM spokesperson told The Register.
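In optimization terms, that vanishing objective amounts to pushing down the detector's most confident person score until no box survives the reporting threshold. A minimal sketch, assuming a differentiable `detector` and a `render_patch` step that pastes the candidate print onto each frame (both are placeholders, not the paper's code):

```python
import torch

def disappearance_loss(person_scores: torch.Tensor) -> torch.Tensor:
    # person_scores: the detector's per-box confidences that a person is
    # present. Penalising the single most confident box pushes every
    # detection below the reporting threshold, so the bounding box
    # vanishes rather than being relabelled as something else.
    return person_scores.max()

# Hypothetical optimisation loop over the training frames:
# patch = torch.rand(3, 300, 300, requires_grad=True)
# optimiser = torch.optim.Adam([patch], lr=0.01)
# for frame in frames:
#     loss = disappearance_loss(detector(render_patch(frame, patch)))
#     optimiser.zero_grad(); loss.backward(); optimiser.step()
```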

The person walking on the left is wearing the adversarial T-shirt with a print distorted using TPS, and is mostly ignored by the object-recognizing camera ... The attack isn't quite perfect, however, as the disguise fails and the wearer is briefly identified as a person by the camera in the fourth frame, as you can see by the bounding box (Image credit: Xu et al)

A close-up of the camera-fooling T-shirt

Real attacks in the physical world are much harder

Crafting adversarial examples from non-rigid materials like T-shirts is tricky. The soft fabric wrinkles as the wearer moves, distorting the adversarial print. So the researchers employed a technique known as a thin plate spline (TPS) based transformer.

TPS learns to map the deformations caused by bodily movement so that, even though the pixels of the adversarial print are warped by wrinkles, the T-shirt can still fool the detector.
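The warping itself is standard thin plate spline interpolation. Below is a minimal NumPy sketch of fitting and applying such a spline from matched control points, say, checkerboard corners on the flat print versus on the worn, wrinkled shirt; the function names are ours, not the paper's:

```python
import numpy as np

def tps_kernel(r2):
    # Thin plate spline radial basis U(r) = r^2 * log(r^2), with U(0) = 0.
    return np.where(r2 == 0, 0.0, r2 * np.log(np.maximum(r2, 1e-12)))

def fit_tps(src, dst):
    # Fit spline coefficients mapping src -> dst control points,
    # both given as (n, 2) arrays of (x, y) coordinates.
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)                      # (n, n) radial-basis terms
    P = np.hstack([np.ones((n, 1)), src])   # (n, 3) affine terms
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)            # (n + 3, 2) coefficients

def warp_points(coef, src, pts):
    # Push arbitrary (m, 2) points through the fitted spline.
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return K @ coef[:src.shape[0]] + P @ coef[src.shape[0]:]
```

Fitting a spline like this to corner positions tracked across video frames captures the family of wrinkle deformations the printed patch has to survive.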

The results aren't perfect, however. The adversarial T-shirt fashioned with TPS successfully evaded detection by the Faster R-CNN model 52 per cent of the time and by the YOLOv2 model 63 per cent of the time. The success rate drops drastically when TPS isn't used: if the adversarial patch is simply printed flat onto the T-shirt, evasion falls to just 11 per cent against Faster R-CNN and 27 per cent against YOLOv2.

To work out how best to distort the pixels for the TPS technique, the researchers filmed someone wearing a T-shirt with a checkerboard print to see how it deformed as they moved. A series of 30 videos, each lasting between five and ten seconds, of someone in the checkerboard T-shirt walking toward a smartphone camera was recorded and used for training.
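The paper doesn't ship code, but pulling per-frame control points out of such footage could look like the following OpenCV sketch; the corners detected in each wrinkled frame pair with the known flat-grid positions to fit a spline like the one above (the 7x6 pattern size is a placeholder):

```python
import cv2

def checkerboard_corners(frame_bgr, pattern=(7, 6)):
    # Find the inner corners of a printed checkerboard in one video
    # frame; returns an (n, 2) array of (x, y) positions, or None if
    # the board isn't visible.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if not found:
        return None
    refined = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01),
    )
    return refined.reshape(-1, 2)
```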

An example of some of the footage recorded for the training dataset (Image credit: Xu et al)

After applying TPS to the adversarial patch and printing it on a T-shirt, the researchers recorded another ten videos of someone wearing the newly crafted apparel and fed them into the Faster R-CNN and YOLOv2 models.

It’s important to note that although the paper describes the attack as “real time,” it’s actually performed using pre-recorded videos rather than someone walking past a camera in real time. “We did not mean the process of generating adversarial perturbation is real-time. It meant the detector that we attacked is real-time, such as, YOLOv2,” the spokesperson said.

So, before you get your hopes up: this method might not help you evade detection as you physically walk past object-recognition cameras. The real world is messy, and other factors like low resolution, objects in the background, or even the way you move can affect the detection process. By feeding the models pre-recorded footage, the researchers sidestepped some of those difficulties.

Nevertheless, they hope to continue studying adversarial examples using “human clothing, accessories, paint on face, and other wearables.” ®
