The eyes have it: 'DeepFakes' bogus AI-meddled videos outed by unblinking gaze

Phony vids generated by machine-learning models can be detected

In the last year or so convincing fake videos known as DeepFakes – the product of deep learning-driven facial image manipulation – have been condemned as a threat to democracy, or what's left of it.

The fear is that invented events represent the sort of fake news that can alter elections and affect civic engagement. Imagine the havoc that could be caused by a video of a prominent politician repudiating democratic norms, with no one sure whether it reflects reality.

A trio of researchers from the University at Albany, State University of New York, believe they have an answer, at least given the current state of video forging tech: measuring how often people depicted in videos blink.

In an academic paper titled "In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking," released recently through preprint server arXiv, Yuezun Li, Ming-Ching Chang and Siwei Lyu describe an approach for detecting inauthentic videos.

On average, the paper explains, people blink about 17 times a minute or 0.283 times per second, a rate that increases with conversation and decreases while reading.

Current DeepFake software fails to take this into account.

"AI generated faces lack eye blinking function, as most training datasets do not contain faces with eyes closed," the paper says. "The lack of eye blinking is thus a telltale sign of a video coming from a different source than a video recorder."
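To make the arithmetic concrete, here is a minimal Python sketch of how a blink rate could be pulled from per-frame eye states and compared with the 17-a-minute figure. It is not the researchers' code: the function names, the assumption that some upstream detector already labels each frame open or closed, and the `min_fraction` cutoff are all hypothetical and chosen only for illustration.

```python
# Average blink rate quoted in the article: about 17 blinks per minute,
# which is 17 / 60, or roughly 0.283 blinks per second.
TYPICAL_BLINKS_PER_MINUTE = 17.0


def count_blinks(eye_closed):
    """Count blinks as open-to-closed transitions in a per-frame sequence."""
    blinks = 0
    previously_closed = False
    for closed in eye_closed:
        if closed and not previously_closed:
            blinks += 1
        previously_closed = closed
    return blinks


def blinks_per_minute(eye_closed, fps):
    """Convert a blink count into a per-minute rate for a clip shot at `fps`."""
    if fps <= 0 or not eye_closed:
        raise ValueError("need a positive frame rate and at least one frame")
    duration_seconds = len(eye_closed) / fps
    return count_blinks(eye_closed) * 60.0 / duration_seconds


def looks_suspicious(eye_closed, fps, min_fraction=0.25):
    """Flag a clip whose blink rate falls far below the typical rate.

    `min_fraction` is a hypothetical cutoff, not a value from the paper.
    """
    rate = blinks_per_minute(eye_closed, fps)
    return rate < min_fraction * TYPICAL_BLINKS_PER_MINUTE


# A 60-second clip at 30 fps in which the eyes never close.
still_clip = [False] * 1800
# A 60-second clip at 30 fps containing 17 blinks of five frames each.
natural_clip = ([False] * 100 + [True] * 5) * 17 + [False] * 15

still_flag = looks_suspicious(still_clip, fps=30)
natural_rate = blinks_per_minute(natural_clip, fps=30)
```

A real detector would need a longer observation window and some allowance for the variation the paper mentions, since blink rates rise in conversation and fall while reading.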

It turns out quite a bit of work has been done developing techniques for eye blink detection. These may involve computing the vertical distance between eyelids to infer eye state, measuring the eye aspect ratio (confusingly acronymized as EAR), and using convolutional neural network (CNN) classifiers to detect open and closed states.
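For the curious, the eye aspect ratio is simple enough to sketch in a few lines of Python. The article does not spell out the formula, so the version below assumes the commonly cited six-landmark formulation (two vertical lid distances divided by twice the corner-to-corner width); the landmark ordering, the sample coordinates and the 0.2 threshold are illustrative assumptions rather than values from the paper.

```python
import math

# Hypothetical cutoff below which an eye is treated as closed.
EAR_CLOSED_THRESHOLD = 0.2


def eye_aspect_ratio(landmarks):
    """Compute the eye aspect ratio from six (x, y) eye landmarks.

    Assumed ordering: p1 and p4 are the eye corners, p2 and p3 sit on the
    upper lid, p6 and p5 sit on the lower lid directly beneath them.
    """
    p1, p2, p3, p4, p5, p6 = landmarks
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    if horizontal == 0:
        raise ValueError("eye corners coincide; cannot compute a ratio")
    return vertical / (2.0 * horizontal)


def is_closed(landmarks, threshold=EAR_CLOSED_THRESHOLD):
    """Infer eye state by thresholding the aspect ratio."""
    return eye_aspect_ratio(landmarks) < threshold


# Made-up landmark coordinates for an open and a nearly shut eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
shut_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]

open_ratio = eye_aspect_ratio(open_eye)
shut_ratio = eye_aspect_ratio(shut_eye)
```

Because the ratio divides by eye width, it stays roughly constant as a face moves closer to or further from the camera, and drops toward zero only when the lids meet.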


Li, Chang and Lyu rely on a Long-term Recurrent Convolutional Networks (LRCN) model for assessing eye state. After some preprocessing to identify facial features and normalize the video frame orientation, they pass cropped eye images into the LRCN for evaluation.

Their technique outperforms other approaches, with reported accuracy of 0.99 (LRCN) compared to 0.98 (CNN) and 0.79 (EAR). What gives LRCN the edge over CNN is that the latter judges each frame in isolation and doesn't take the eye's state in earlier frames into account.
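Why memory of earlier frames matters can be shown with a toy example. The Python sketch below is emphatically not an LRCN, nor anything from the paper: it swaps in a plain running average as a stand-in for recurrent state, and the scores, threshold and `memory` weight are all invented for illustration.

```python
def framewise_states(closed_scores, threshold=0.5):
    """Label each frame from its own score, with no memory of earlier frames."""
    return [score > threshold for score in closed_scores]


def recurrent_states(closed_scores, threshold=0.5, memory=0.5):
    """Label each frame from a running state that blends in earlier frames.

    state_t = memory * state_(t-1) + (1 - memory) * score_t
    """
    state = 0.0
    labels = []
    for score in closed_scores:
        state = memory * state + (1.0 - memory) * score
        labels.append(state > threshold)
    return labels


def count_closures(labels):
    """Count runs of consecutive frames labelled as closed."""
    runs = 0
    previous = False
    for closed in labels:
        if closed and not previous:
            runs += 1
        previous = closed
    return runs


# Hypothetical per-frame "eye closed" scores: a one-frame noise spike at
# frame 2, then a genuine blink spanning frames 5 to 8.
scores = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.9, 0.8, 0.1, 0.1]

framewise_count = count_closures(framewise_states(scores))
recurrent_count = count_closures(recurrent_states(scores))
```

On these made-up scores the frame-by-frame labeller reports two closures, counting the noise spike as a blink, while the version with memory reports one. The trade-off is a slight lag: the smoothed state registers the real blink a frame late.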

The researchers contend that their approach to blink measurement shows promise as a means to detect fake videos. But they acknowledge that skilled video forgers may be able to create more realistic blinking through post-processing, better models and more training data.

In the long run, they suggest, other types of physiological signals will also need to be considered in efforts to detect fake videos. ®
