GTC AI systems can recognise faces better than humans in theory, but when they're deployed in practice they often fall flat.
Machines can analyse people's faces in a way that humans cannot: they place landmarks outlining facial features and calculate minute details, such as the distances between the eyes, nose and lips. Human face recognition, by contrast, is innate and more nuanced than that.
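The landmark measurements described above can be sketched in a few lines. This is a toy illustration only: the coordinates are made-up values standing in for what a real landmark detector would output.

```python
import numpy as np

# Hypothetical (x, y) pixel coordinates for three facial landmarks,
# standing in for the output of a real landmark detector.
landmarks = {
    "left_eye": np.array([120.0, 95.0]),
    "right_eye": np.array([180.0, 96.0]),
    "nose_tip": np.array([150.0, 140.0]),
}

def euclidean(a, b):
    """Straight-line distance between two landmark points."""
    return float(np.linalg.norm(a - b))

inter_eye = euclidean(landmarks["left_eye"], landmarks["right_eye"])
eye_to_nose = euclidean(landmarks["left_eye"], landmarks["nose_tip"])

# Ratios of such distances are roughly invariant to image scale,
# which is what makes them useful as crude facial measurements.
ratio = eye_to_nose / inter_eye
```

Absolute pixel distances depend on how close the camera was, which is why ratios between them, rather than the raw values, are the more useful features.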
Neural networks can be tweaked or fed more training data to improve their accuracy scores; among humans, some people are simply better at it than others. Facial-recognition models began to be widely deployed around 2015, when their accuracy at identifying people in benchmark photo datasets rose to superhuman levels.
People can recognise others with about 97.53 per cent accuracy, but some systems now reach 98 or 99 per cent, said Winston Hsu, a computer science professor at National Taiwan University, during a talk at the GPU Technology Conference.
There are many pre-trained architectures to choose from, too, such as VGG-16 or ResNet-50. In fact, Hsu said, the network structure isn't all that important: all the models work in broadly the same way, converting pixel values to numbers and crunching them with matrix operations to learn patterns in the data.
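The pixels-to-numbers-to-matrix-operations pipeline Hsu describes can be shown with a toy forward pass. This is not any particular architecture: the image and layer weights below are random stand-ins, whereas a real network like VGG-16 or ResNet-50 learns its weights from training data.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fake 8x8 greyscale "face" image: just pixel intensities in [0, 1].
image = rng.random((8, 8))

# Toy layer weights; in a real network these are learned, not random.
w1 = rng.standard_normal((64, 16))
w2 = rng.standard_normal((16, 4))

x = image.reshape(-1)          # pixels become a flat vector of numbers
h = np.maximum(x @ w1, 0.0)    # matrix multiply + ReLU non-linearity
embedding = h @ w2             # a small "face embedding" vector
```

Whatever the architecture, the end product is the same: a compact vector of numbers (an embedding) that can be compared against embeddings of other faces.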
So, why don't they work well in the wild? Obviously, machines don't understand anything they're looking at. Remember the embarrassing incident when Dong Mingzhu, a prominent Chinese businesswoman, was accused of jaywalking after an advert bearing a large image of her face, plastered on the side of a bus, drove past and was snapped by traffic cameras? Systems are also fooled when faces appear in different poses or under different lighting.
Hsu called this problem "large intra-class variation": when images of the same person in a dataset are structurally different, computers can fail to recognise faces they've seen before. The converse, "high inter-class similarity" – where photos of different people look alike, for example a row of uniform headshots – is more likely to lead to false-positive mismatches.
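Both failure modes show up when face embeddings are compared with a similarity threshold. The sketch below uses hypothetical embedding vectors and a made-up threshold to illustrate the two cases: same-person embeddings drifting apart, and different-person embeddings landing too close together.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Large intra-class variation: two photos of the same person, taken in
# different poses/lighting, produce embeddings that land far apart.
same_person_a = np.array([0.9, 0.1, 0.4])
same_person_b = np.array([0.1, 0.8, 0.5])   # drifted -> missed match

# High inter-class similarity: two different people's photos produce
# embeddings that land close together.
person_x = np.array([0.7, 0.2, 0.3])
person_y = np.array([0.72, 0.21, 0.28])     # near-duplicate -> false positive

THRESHOLD = 0.9  # declare "same person" if similarity exceeds this
```

With these values, the genuine pair falls below the threshold (a false negative) while the impostor pair exceeds it (a false positive) – exactly the intra-class/inter-class trade-off Hsu describes.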
There are also biases in the data. Most public datasets are racially skewed towards pale faces. A system trained on something like the MS-Celeb-1M Challenge 1 Model, for example, won't be very useful for identifying ordinary people on the street in more racially diverse countries. Hollywood celebrities rarely seem to wear glasses, for some reason, but a lot of people in Asia do, so something like this just wouldn't fit the specs, Hsu joked. ®