Dear rioters: Hiding your face with scarves, hats can't fool this AI system

Accuracy is not great – but it's a start for computer-aided crackdowns by cops

Software can take a decent stab at identifying looters, rioters and anyone else who hides their faces with scarves, hats, and glasses, a study has shown.

A paper by a team of researchers at the University of Cambridge, UK, the National Institute of Technology in Warangal, India, and the Indian Institute of Science, describes how a convolutional neural network can be trained for so-called disguised face identification (DFI).

Amarjot Singh, based at Cambridge's department of engineering, told The Register on Tuesday that DFI is of “great interest to law enforcement as they can use this technology to identify criminals. This is the primary reason we attempted to solve this problem.”

For the study, a convolutional neural network was trained by showing it hundreds of images of people who had concealed their mugs using bits of clothing from sunglasses to beards and helmets. For each snap, 14 points were identified: ten marking the eyebrow and eye regions, one for the nose, and three for the lips.

All the points connect to create what the researchers call a “star-net structure.” The distances and angles between the points on the network are analyzed to learn the structure of people's faces even when obscured by shades or balaclavas, and so on.
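To make the idea concrete, here is a minimal sketch of turning 14 (x, y) keypoints into a vector of pairwise distances and angles. The function name and the exact feature layout are assumptions for illustration, not the authors' code:

```python
import numpy as np

def star_net_features(points: np.ndarray) -> np.ndarray:
    """points: array of shape (14, 2) holding (x, y) keypoint coordinates.

    Connects every pair of points and describes the face structure by the
    length and orientation of each connection.
    """
    assert points.shape == (14, 2)
    feats = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx, dy = points[j] - points[i]
            feats.append(np.hypot(dx, dy))    # distance between the two points
            feats.append(np.arctan2(dy, dx))  # angle of the connection
    return np.asarray(feats)

# 14 points give 14 * 13 / 2 = 91 connections, two numbers each
demo = np.random.rand(14, 2)
print(star_net_features(demo).shape)  # (182,)
```

Because the features are relative distances and angles rather than raw pixels, the same description can in principle be computed whether or not the surrounding face is covered, so long as the 14 points themselves can be located.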

The crucial thing to understand here is that the system is trained to identify these markers from the pixels of any obscured face. It is taught how to take a masked head and create from it a fingerprint, if you will, of the structure of a person's face out of these 14 points.

These points, this facial fingerprint, can then be used to scan a larger database – a database of pictures the DFI machine-learning system hasn't seen before – such as a library of driving license photos or mugshots of known troublemakers – and identify potential matches.

In other words, shown a cropped still from CCTV footage of a riot, the program would intuitively place the 14 markers. That pattern of markers would then be used to match a face in a database. And bingo, your miscreant is unmasked.
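The matching step described above amounts to a nearest-neighbour search: reduce every face to a fixed-length feature vector, then find the database entry whose vector is closest to the query. A minimal sketch, with the Euclidean metric as an assumption:

```python
import numpy as np

def best_match(query: np.ndarray, database: np.ndarray) -> int:
    """Return the index of the database row closest to the query vector."""
    dists = np.linalg.norm(database - query, axis=1)
    return int(np.argmin(dists))

# Toy database of three 2-D "fingerprints"
db = np.array([[0.0, 0.0],
               [1.0, 1.0],
               [5.0, 5.0]])
print(best_match(np.array([0.9, 1.2]), db))  # 1
```

In practice a deployment would also need a rejection threshold, so that a face with no good match in the database is not forced onto the nearest innocent bystander.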

That's how it would work in theory. This thing is still a proof of concept.

Images of five different participants. The top row is a clean input image. The second row has the fourteen points mapped out. The third row shows the star-net structure (Photo credit: Singh et al)

A thousand images of people in various disguises were used for training, with 500 more held back for validation and another 500 for testing – the standard practice of keeping training, validation, and testing datasets separate so the model is scored on pictures it has never seen.
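The 1000/500/500 split can be sketched as a shuffle-then-slice over the full set of 2,000 images; the shuffling convention here is a common approach, not necessarily the authors' exact method:

```python
import random

def split_dataset(items, n_train=1000, n_val=500, n_test=500, seed=0):
    """Shuffle once, then carve off disjoint train/validation/test slices."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

train, val, test = split_dataset(range(2000))
print(len(train), len(val), len(test))  # 1000 500 500
```

The validation slice is used to tune the model during development; the test slice is touched only once, at the end, to report the headline accuracy figures.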

Although Singh hopes it will be helpful for police trying to finger crooks and looters, it has sparked some fears of authoritarian regimes using it to digitally unmask innocent folks and incriminate legitimate protesters and activists. Singh said that while he agreed that there is a concern regarding the violation of privacy and right to assemble, the technology could be used to “take many criminals off the streets as well.”

“As to make sure that this doesn’t fall into the wrong hands, we will just have to be aware that this technology is available only to the organizations that intend to use it for the good,” he said.

AI is still easily fooled

It’s important to realize that while the accuracy of this computer vision system is good for early research, on a practical level it isn’t that awesome. Cluttering up the background with buildings and objects is enough to lower its precision from 85 per cent to 56 per cent.

The more the face is obscured, the more difficult it is to recognize. A combination of a hat, scarf and glasses can lower the accuracy to 43 per cent. All the training images also feature clear shots of people facing the camera, something that is not often the case in fuzzy stills from CCTV footage. A lot of that material will feature people standing at various angles to the cameras. Then there's the fact that people wearing full-cover masks – think V for Vendetta – will fool it all the time.

Table of the system's accuracy rates

The training dataset is also pretty limited since it’s expensive to pay participants to dress up in various items and poses and have their photos taken so that their faces can be mapped out. Each person was paid about ten dollars for a 30-minute session, Singh said.

“The current dataset has 10 disguises and is composed of primarily Indian and some Caucasian people. This needs to be expanded to more disguises with people from other ethnicities as well to increase the effectiveness of the DFI system,” he added. In other words, you're going to be out of luck if your protesters aren't white or Indian.

So far this is a lab system. It needs thousands upon thousands more pictures to train it so it can accurately identify facial markers from a range of different types of human, and not be misled by objects bearing only a distant similarity to a person's face.

This will require more efficient algorithms and code to achieve at scale. It isn't there yet.

The researchers will present their work – unearthed this week by Jack Clark's Import AI newsletter – next month at the IEEE International Conference on Computer Vision Workshop in Venice, Italy. ®
