This article is more than 1 year old

Can you get from 'dog' to 'car' with one pixel? Japanese AI boffins can

Fooling an image classifier is surprisingly easy and suggests novel attacks

It doesn't take much to confuse AI image classifiers: a group from Japan's Kyushu University reckon you can fool them by changing the value of a single pixel in an image.

Researchers Jiawei Su, Danilo Vasconcellos Vargas and Kouichi Sakurai were working with two objectives: first, to predictably trick a Deep Neural Network (DNN), and second, to automate their attack as far as possible.

In other words, what does it take to get the AI to look at an image of an automobile, and classify it as a dog? The surprising answer: an adversarial perturbation of just one pixel would do the trick – a kind of attack that you'd be unlikely to detect with the naked eye.

As explained in a paper, the researchers came up with the startling conclusion that a one-pixel attack worked on nearly three-quarters of standard training images.

Not only that, but the boffins didn't need to know anything about the inside of the DNN – as they put it, they only needed its “black box” output of probability labels to function.

The attack was based on a technique called “differential evolution” (DE), an optimisation method which in this case identified the best pixels to target (the paper tested attacks against one, three, and five pixels).
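
For readers who want to see the shape of such an attack, here is a minimal Python sketch. It is not the researchers' code. It assumes a toy 8×8 greyscale image and a hypothetical `black_box` function standing in for the DNN; the population size, number of generations and scale factor are illustrative choices. Each candidate is a single pixel edit (x, y, value), and the search only ever looks at the probabilities the classifier returns.

```python
import math
import random

SIZE = 8  # toy image is SIZE x SIZE greyscale, pixel values in [0, 1]

# Hypothetical stand-in for a trained network: a fixed linear model whose
# weights are largest near pixel (3, 3). Nothing here comes from the paper.
WEIGHTS = [[1.0 / (1 + (x - 3) ** 2 + (y - 3) ** 2) for x in range(SIZE)]
           for y in range(SIZE)]
CLEAN = [[0.8] * SIZE for _ in range(SIZE)]
# Chosen so the clean image is assigned to class 0 with moderate confidence.
THRESHOLD = 0.8 * sum(map(sum, WEIGHTS)) - 0.2


def black_box(image):
    """Return [P(class 0), P(class 1)]; the attacker sees only this output."""
    score = sum(WEIGHTS[y][x] * image[y][x]
                for y in range(SIZE) for x in range(SIZE))
    p0 = 1.0 / (1.0 + math.exp(-5.0 * (score - THRESHOLD)))
    return [p0, 1.0 - p0]


def apply_pixel(image, candidate):
    """Copy the image and overwrite the one pixel described by (x, y, value)."""
    x = min(SIZE - 1, max(0, int(round(candidate[0]))))
    y = min(SIZE - 1, max(0, int(round(candidate[1]))))
    value = min(1.0, max(0.0, candidate[2]))
    perturbed = [row[:] for row in image]
    perturbed[y][x] = value
    return perturbed, (x, y, value)


def one_pixel_attack(image, true_class, pop_size=30, generations=50,
                     scale=0.5, seed=0):
    """Search for the one-pixel edit that most lowers P(true_class)."""
    rng = random.Random(seed)

    def fitness(candidate):
        perturbed, _ = apply_pixel(image, candidate)
        return black_box(perturbed)[true_class]  # lower is better

    population = [[rng.uniform(0, SIZE - 1), rng.uniform(0, SIZE - 1),
                   rng.random()] for _ in range(pop_size)]
    scores = [fitness(c) for c in population]

    for _ in range(generations):
        for i in range(pop_size):
            # DE mutation: combine three randomly chosen candidates.
            a, b, c = rng.sample(population, 3)
            child = [a[k] + scale * (b[k] - c[k]) for k in range(3)]
            child_score = fitness(child)
            # The child replaces its parent only if it fools the model more.
            if child_score < scores[i]:
                population[i], scores[i] = child, child_score

    best = min(range(pop_size), key=scores.__getitem__)
    _, pixel = apply_pixel(image, population[best])
    return pixel, scores[best]


p_before = black_box(CLEAN)[0]
pixel, p_after = one_pixel_attack(CLEAN, true_class=0)
print(f"P(class 0) before: {p_before:.3f}")
print(f"P(class 0) after changing pixel {pixel}: {p_after:.3f}")
```

Because a child only replaces its parent when it lowers the true-class probability, the best candidate can never get worse from one generation to the next. Extending the candidate to several pixels, or to colour values, would only change the length of each candidate list.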

Naturally, more pixels make the attack more effective: a three-pixel perturbation achieved a success rate of 82 per cent, and adjusting five pixels lifted that to 87.3 per cent.

Editing that one carefully chosen pixel meant “each natural image can be perturbed to 2.3 other classes on average”.

The best result was an image of a dog in the training set, which the trio managed to trick the DNN into classifying as each of the nine “target” classes in turn – airplane, automobile, bird, cat, deer, frog, horse, ship and truck.

Mis-classified images

Images from the training set, mis-classified by changing one pixel. Image: arXiv paper

The sharp-eyed, looking at the test images, will note that the single-pixel attack was carried out against images of a mere 1,024 pixels. However, on something more substantial like a 280,000-pixel image (still not so large), only 273 pixels need to be perturbed, and a human observer still might not notice the changes. ®
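
The 273 figure looks like the one-in-1,024 ratio scaled up to the larger image. A quick check of that arithmetic, assuming that is how the number was derived:

```python
small_image_pixels = 1_024      # size of the images attacked with one pixel
large_image_pixels = 280_000    # the larger example

# Keep the same fraction of perturbed pixels: 1 in 1,024.
equivalent_pixels = large_image_pixels / small_image_pixels
print(int(equivalent_pixels))              # 273
print(f"{1 / small_image_pixels:.3%}")     # 0.098%
```

Either way, the perturbed pixels amount to roughly a tenth of one per cent of the image.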
