Face masks worn to reduce the spread of the COVID-19 coronavirus typically decrease the accuracy of commercial facial recognition algorithms by up to 50 per cent, according to an investigation by America's technical standards watchdog, NIST.
“With the arrival of the pandemic, we need to understand how face recognition technology deals with masked faces,” said Mei Ngan, a computer scientist at NIST who cowrote the investigation's final report. “We have begun by focusing on how an algorithm developed before the pandemic might be affected by subjects wearing face masks.”
Computer-vision algorithms learn to recognize faces by picking out features, such as the distance between the eyes or the shape of a jaw. When parts of the face are covered up by face masks, it’s not surprising that machines perform worse. The drop in accuracy depends on the algorithm – some are more robust than others.
NIST probed 89 algorithms from organizations including Intel, Samsung, Acer, Panasonic, and various universities around the world. Given an image of someone wearing what NIST called a digital mask – one pasted onto the picture – each AI system was told to identify the individual by matching them to an unmasked picture of the same person in a database. The task is described as one-to-one matching in the report [PDF].
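For a concrete picture of what one-to-one matching involves, here is a minimal sketch. It assumes each face has already been boiled down to a numeric feature vector, and it compares the two vectors with cosine similarity. The function names, the toy vectors, and the 0.8 threshold are all made up for illustration; none of them comes from NIST's report or from any vendor's algorithm.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(probe, reference, threshold=0.8):
    """One-to-one match: accept only if the probe is close enough to the reference.

    The threshold is a hypothetical value chosen for this example.
    """
    return cosine_similarity(probe, reference) >= threshold


# Toy feature vectors (invented): a reference photo, an unmasked probe of the
# same person, and a probe whose lower-face features have been disturbed.
reference = [0.9, 0.1, 0.4]
unmasked_probe = [0.88, 0.12, 0.41]
masked_probe = [0.9, -0.6, -0.5]

print(verify(unmasked_probe, reference))
print(verify(masked_probe, reference))
```

In this toy setup the unmasked probe clears the threshold and the disturbed one does not, which is the kind of failure to authenticate that the report counts.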
NIST generated the digital masks because it proved difficult to collect a large, high-quality image dataset of people both wearing and not wearing masks, so it applied synthetic masks to images it had already collected.
“Using unmasked images, the most accurate algorithms fail to authenticate a person about 0.3 per cent of the time. Masked images raised even these top algorithms’ failure rate to about 5 per cent, while many otherwise competent algorithms failed between 20 per cent to 50 per cent of the time,” according to NIST.
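To put those percentages side by side, the failure rate is simply the share of genuine attempts that get rejected. The sketch below is a back-of-the-envelope calculation, not NIST's methodology: the helper name and the attempt counts are hypothetical, picked only so that they reproduce the 0.3 per cent and 5 per cent figures quoted above.

```python
def failure_rate(genuine_attempts, rejected):
    """Fraction of genuine match attempts that were wrongly rejected."""
    if genuine_attempts <= 0:
        raise ValueError("genuine_attempts must be positive")
    return rejected / genuine_attempts


# Hypothetical counts chosen to match the quoted rates.
unmasked_rate = failure_rate(10_000, 30)   # 0.3 per cent
masked_rate = failure_rate(10_000, 500)    # 5 per cent

# How many times more often the same algorithm fails once masks are added.
increase = masked_rate / unmasked_rate

print(f"unmasked: {unmasked_rate:.1%}, masked: {masked_rate:.1%}, "
      f"increase: {increase:.1f}x")
```

On these numbers, even the best algorithms fail roughly 17 times more often when the probe image is masked.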
To test the algorithms, the institute used two datasets made up of people applying for immigrant benefits or entering the US, weirdly enough – or aptly if you consider the possible future application of this sort of tech in America:
We used these algorithms with two large datasets of photographs collected in US governmental applications that are currently in operation: unmasked application photographs from a global population of applicants for immigration benefits and digitally-masked border crossing photographs of travelers entering the United States. Both datasets were collected for authorized travel or immigration processes.
The application photos (used as reference images) have good compliance with image capture standards. The digitally-masked border crossing photos (used as probe images) are not in good compliance with image capture standards given constraints on capture duration and environment. The application photos were left unmasked, and synthetic masks were applied to the border crossing photos. This mimics an operational scenario where a person wearing a mask attempts to authenticate against a prior visa or passport photo.
Together these datasets allowed us to process a total of 6.2 million images of 1 million people through 89 algorithms.
A closer look at the results revealed that error rates depend on a number of factors, including the overall shape of the digital mask as well as its color. Machines find it more difficult to recognize faces when the mask is black and covers up the top of the nose and bottom half of the face, compared to when it’s light blue and blocks the mouth and jaw only.
The accuracy also varies wildly. The best performing algorithm was from DeepGlint, a computer vision and AI startup based in China, which was able to correctly identify faces obscured by digital light blue masks with high coverage just over 96 per cent of the time. Many had error rates of between 20 and 50 per cent. There are a couple of anomalies too, where some algorithms had error rates of near or up to 100 per cent.
NIST said it plans later this year to examine facial-recognition algorithms that have been specifically trained to identify images of people wearing masks. It’s also considering adding in the effects of patterned or multi-colored masks in future tests, we’re told. ®