Don't believe the hype that AI-generated 'master faces' can break into face recognition systems any time soon

The machine learning model was trained and tested on limited data


Analysis The idea of so-called “master faces,” a set of fake images generated by machine learning algorithms to crack into facial biometric systems by impersonating people, made splashy headlines last week. But a closer look at the research reveals clear weaknesses that make it unlikely to work in the real world.

“A master face is a face image that passes face-based identity-authentication for a large portion of the population,” the paper, released on arXiv earlier this month, explained. “These faces can be used to impersonate, with a high probability of success, any user, without having access to any user-information.”

The trio of academics from Tel Aviv University go on to say they built a model that generated nine master faces capable of representing 40 per cent of the population, and that these faces bypassed “three leading deep face recognition systems.” At first glance the result seems impressive, and such a vulnerability would pose clear security risks for applications that rely on facial identification.

First, the team employed Nvidia’s StyleGAN system to create realistic-looking images of made-up faces. Each fake face was compared against one real photograph of each of the 5,749 people represented in the Labeled Faces in the Wild (LFW) dataset, and a separate classifier algorithm scored how similar each AI-generated face looked to the real ones.

Images that the classifier scored highly for similarity were kept, and the rest were discarded. Those scores then guided an evolutionary algorithm, which steered StyleGAN towards spoof faces that resembled more and more of the people in the dataset.
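
To make that loop concrete, below is a minimal, self-contained sketch of the keep-or-discard step. The generator, the dataset embeddings, and the similarity threshold are all stand-ins invented for illustration (random vectors rather than StyleGAN outputs and real LFW photos); only the shape of the logic follows the description above.

```python
# Minimal sketch of the generate-and-score loop described above.
# generate_face() and the dataset embeddings are hypothetical stand-ins
# (random vectors) for StyleGAN outputs and real LFW photos, so the
# script runs on its own; KEEP_THRESHOLD is an assumed cut-off.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 128            # size of a face embedding
N_PEOPLE = 5749          # one real photo per person, as in LFW
KEEP_THRESHOLD = 0.35    # assumed "looks similar enough" cut-off

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in for the embedding of one real photo per person in the dataset.
dataset = unit(rng.normal(size=(N_PEOPLE, EMB_DIM)))

def generate_face():
    """Stand-in for StyleGAN: returns the embedding of a made-up face."""
    return unit(rng.normal(size=EMB_DIM))

def similarity_score(candidate):
    """Stand-in for the classifier: best cosine similarity to any real photo."""
    return float(np.max(dataset @ candidate))

# Keep the high-scoring fakes, discard the rest; these scores are what the
# evolutionary search later uses to steer the generator.
kept = [face for face in (generate_face() for _ in range(1000))
        if similarity_score(face) >= KEEP_THRESHOLD]
print(f"kept {len(kept)} of 1000 candidates")
```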

Over time, the researchers homed in on a set of master faces that covered as many of the images in the dataset as possible. In short, just nine images were enough to represent 40 per cent of the 5,749 people in LFW.
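
The researchers drive that search with an evolutionary algorithm over StyleGAN’s latent space. As a much simpler stand-in that illustrates the same goal, the sketch below greedily picks a handful of candidate faces that jointly “match” as many dataset identities as possible. The embeddings are random vectors and the match threshold is an assumption, so the coverage it prints is nothing like the paper’s 40 per cent; only the selection logic is the point.

```python
# Simplified stand-in for the coverage search: greedily pick candidate
# faces that jointly match as many dataset identities as possible.
# Random embeddings and an assumed threshold replace the real StyleGAN
# faces, LFW photos, and evolutionary optimiser.
import numpy as np

rng = np.random.default_rng(1)
EMB_DIM, N_PEOPLE, N_CANDIDATES = 128, 5749, 2000
MATCH_THRESHOLD = 0.30   # assumed similarity needed to count as a match

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

dataset = unit(rng.normal(size=(N_PEOPLE, EMB_DIM)))         # one photo per person
candidates = unit(rng.normal(size=(N_CANDIDATES, EMB_DIM)))  # generated faces

# matches[i, j] is True if candidate i would pass as dataset identity j.
matches = (candidates @ dataset.T) >= MATCH_THRESHOLD

chosen, covered = [], np.zeros(N_PEOPLE, dtype=bool)
for _ in range(9):   # the researchers settled on nine master faces
    gains = (matches & ~covered).sum(axis=1)
    best = int(np.argmax(gains))
    if gains[best] == 0:
        break
    chosen.append(best)
    covered |= matches[best]

print(f"{len(chosen)} faces cover {covered.mean():.1%} of the identities")
```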

Next, they used these master faces to try to fool three different facial recognition models: Dlib, FaceNet, and SphereFace. These systems ranked among the highest in the contest that benchmarks face-matching algorithms on the LFW dataset.
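
Whether a master face “passes” for a given person comes down to the decision every face-verification system makes: embed both images and check whether their similarity clears an acceptance threshold. Below is a generic, hedged sketch of that check and of how coverage of an enrolled gallery could be measured; the cosine metric and threshold are assumptions, and Dlib, FaceNet, and SphereFace each define their own embeddings and operating points.

```python
# Generic face-verification check: a probe "passes" as an enrolled identity
# if the similarity of their embeddings clears the acceptance threshold.
# The cosine metric and threshold here are assumptions; real systems
# (Dlib, FaceNet, SphereFace) define their own embeddings and thresholds.
import numpy as np

ACCEPT_THRESHOLD = 0.5   # assumed operating point, not any system's real value

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def passes(probe_emb, enrolled_emb, threshold=ACCEPT_THRESHOLD):
    """Would the system accept the probe image as this enrolled person?"""
    return cosine(probe_emb, enrolled_emb) >= threshold

def coverage(master_emb, gallery_embs, threshold=ACCEPT_THRESHOLD):
    """Fraction of enrolled identities the master face is accepted as."""
    return float(np.mean([passes(master_emb, g, threshold) for g in gallery_embs]))

# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(2)
gallery = rng.normal(size=(100, 128))   # embeddings of 100 enrolled people
master = rng.normal(size=128)           # embedding of one candidate master face
print(f"Accepted as {coverage(master, gallery):.0%} of enrolled identities")
```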

A quick look at the highest-scoring master faces capable of bypassing each of the three models, however, shows a clear limitation in the research. They’re pretty much all fake images of older Caucasian men with white hair, glasses, and moustaches. If images of this type can represent a large proportion of the LFW dataset, then the dataset itself must be somewhat flawed.

[Image] The best master face that was able to trick Dlib (left), FaceNet (middle), and SphereFace (right). Taken from Figure 4 in the paper.

Garbage in, garbage out

A disclaimer posted on the website hosting the dataset confirms this: “Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over the age of 80, and a relatively small proportion of women. In addition, many ethnicities have very minor representation or none at all.”

The scores of the nine master faces reflect the limitations of the LFW dataset: faces that are female, darker in skin tone, or younger score lower and are less likely to bypass the three models that were tested.

[Image] The nine master faces that represent 40 per cent of the LFW dataset. Notice how the scores are lower for people who are younger, female, or have darker skin tones. Taken from Figure 5 of the paper.

“While theoretically LFW could be used to assess performance for certain subgroups, the database was not designed to have enough data for strong statistical conclusions about subgroups. Simply put, LFW is not large enough to provide evidence that a particular piece of software has been thoroughly tested,” according to another disclaimer listed on the LFW’s website.

Although the idea of master faces capable of impersonating a vast proportion of people’s faces to unlock face recognition systems is interesting, the research here is just another case of a machine learning model trained and tested on flawed data. Garbage in, garbage out, as they say.

Because the LFW dataset lacks diversity, a small set of computer-generated master faces can cover a large proportion of it. It’s unlikely that these images would work anywhere near as well in the real world.

And no real-world tests

“LFW indeed suffers from the limitations described in its official website, but in spite of these limitations, LFW is a widely used dataset in the academic literature for evaluating face recognition methods,” Tomer Friedlander, co-author of the paper and a researcher at the School of Electrical Engineering at Tel Aviv University, told The Register.

“Our paper presents a possible vulnerability of face recognition systems, which can be exploited by attackers. Therefore, it should be taken into consideration by both developers and users of face recognition methods. We have not tested our method against commercial face recognition systems, which are used in real life, so we cannot refer to systems in real life.”

It’s possible to adapt the model to larger, more diverse datasets to try to trick systems in the real world, he said. “We are interested in further exploring the possibility of using the master faces generated by our method in order to help protect existing facial recognition systems from such attacks. We leave this for future research.”

Don’t fall for the scaremongering headlines claiming these master faces can break into “over 40 per cent of facial ID authentication systems” or that they’re “wildly successful”. There’s little evidence to support those claims.

Friedlander told us the paper has been accepted at this year’s IEEE International Conference on Automatic Face & Gesture Recognition, to be held in December. ®
