AI + ML

Revealed: How Nvidia's 'backseat driver' AI learned to read lips

Driving assistant gives self-drivers a bit of Lip(Net)

Tue 17 Jan 2017 // 10:02 UTC

When Nvidia popped the bonnet on its Co-Pilot "backseat driver" AI at this year’s Consumer Electronics Show, most onlookers were struck by its ability to lip-read while tracking the actions of CES-going "motorists" within the "car".

A slide taken at CES shows the Co-Pilot AI assistant performing four functions: facial recognition, head tracking, gaze tracking and lip-reading.

The @nvidia AI co-pilot analyzes you through face recognition, head and gaze tracking and lip reading to assist you. #CES2017 pic.twitter.com/sD2N4Kkinr
— CES (@CES) January 5, 2017

The automotive AI is part of the GPU-flinger's DRIVE PX 2 platform, which uses sensors and multiple neural networks powered by the grunt of Nvidia's processors.

An Nvidia spokesperson has since confirmed in an email to The Register that the lip-reading component was based on a research paper [PDF] written by academics from the University of Oxford, Google DeepMind and the Canadian Institute for Advanced Research.

"We are really happy to see LipNet in such an application and is the proof that our novel architecture is scalable to real-world problems," the research team added in an email to El Reg.

"Machine lip readers have enormous practical potential, with applications in speech recognition in noisy environments such as cars, improved hearing aids, silent dictation in public spaces (Siri will never have to hear your voice again), covert conversations, biometric identification, and silent-movie processing."

The paper was initially criticised. Although the neural network, LipNet, had an impressive accuracy rate of 93.4 per cent, it was only tested on a limited dataset of words and not coherent sentences. We're told LipNet was later retrained using a dataset of 22 drivers to improve it.

"Since it is ongoing research we cannot disclose error rates," the LipNet team said of the retrained model. "But we can say that after less than a day of training, the performance was as good as expected."

Increasing the amount of useful training data generally improves AI models. For example, a second paper, unofficially published on arXiv by another team at Oxford, demonstrated that a better AI-based lip-reading system is possible. It could decipher complete sentences after it had been trained to watch the speech movements of BBC News presenters for several hours.

Nvidia’s Co-Pilot assistant shows LipNet has progressed further to pick up the spoken commands of drivers so it can process instructions such as choosing a song to play, even when loud music is already thumping in the background.

The head- and gaze-tracking and facial recognition capabilities were developed to provide better security and a safer driving experience, said Nvidia.

“[There is] an AI for face recognition, so the car knows who you are, setting personal preferences and eliminating the need for a key. An AI for gaze detection, so your car knows if you’re paying attention,” Nvidia wrote in a blog post.

Nvidia is mostly known for designing powerful GPUs for gaming and HPC but has lately been putting more of its efforts towards GPU-accelerated machine learning and AI.

Mercedes, Audi, Tesla and Toyota are current customers of the new technology, an Nvidia spokesperson confirmed to The Register. ®
