Revealed: How Nvidia's 'backseat driver' AI learned to read lips

Driving assistant gives self-drivers a bit of Lip(Net)

When Nvidia popped the bonnet on its Co-Pilot "backseat driver" AI at this year’s Consumer Electronics Show, most onlookers were struck by its ability to lip-read while tracking CES-going "motorists'" actions within the "car".

A slide shown at CES lists four features of the Co-Pilot AI assistant: facial recognition, head tracking, gaze tracking and lip-reading.

The automotive AI is part of the GPU-flinger's DRIVE PX 2 platform, which uses sensors and multiple neural networks powered by the grunt of Nvidia's processors.

An Nvidia spokesperson has since confirmed in an email to The Register that the lip-reading component was based on a research paper [PDF] written by academics from the University of Oxford, Google DeepMind and the Canadian Institute for Advanced Research.

"We are really happy to see LipNet in such an application, and it is proof that our novel architecture is scalable to real-world problems," the research team added in an email to El Reg.

"Machine lip readers have enormous practical potential, with applications in speech recognition in noisy environments such as cars, improved hearing aids, silent dictation in public spaces (Siri will never have to hear your voice again), covert conversations, biometric identification, and silent-movie processing."

The paper was initially criticised. Although the neural network, LipNet, had an impressive accuracy rate of 93.4 per cent, it was only tested on a limited dataset of words and not coherent sentences. We're told LipNet was later retrained on a dataset of 22 drivers to improve it.

"Since it is ongoing research we cannot disclose error rates," the LipNet team said of the retrained model. "But we can say that after less than a day of training, the performance was as good as expected."

Increasing the amount of useful training data improves AI models. For example, a second paper, unofficially published on arXiv by another team at Oxford, demonstrated that a better AI-based lip-reading system is possible. It could decipher complete sentences after it had been trained to watch the speech movements of BBC News presenters for several hours.

Nvidia's Co-Pilot assistant suggests LipNet has progressed far enough to pick up drivers' spoken commands, processing instructions such as choosing a song to play even when loud music is thumping in the background.

The head- and gaze-tracking and facial recognition capabilities were developed to provide better security and a safer driving experience, said Nvidia.

“[There is] an AI for face recognition, so the car knows who you are, setting personal preferences and eliminating the need for a key. An AI for gaze detection, so your car knows if you’re paying attention,” Nvidia wrote in a blog post.

Nvidia is mostly known for designing powerful GPUs for gaming and HPC but has lately been putting more of its efforts towards GPU-accelerated machine learning and AI.

Mercedes, Audi, Tesla and Toyota are current customers of the new technology, an Nvidia spokesperson confirmed to The Register. ®
