Intel gives away lip-reading speech recognition code
Intel has released lip-reading visual speech recognition software under an open source licence.
Called Audio Visual Speech Recognition (AVSR), the software is part of Intel's OpenCV computer vision and facial recognition code library. Essentially, it tracks the speaker's mouth movements as individual character and syllable sounds are formed. Intel reckons the technique to be far more accurate than traditional speech recognition algorithms, which analyse sounds rather than images.
That's not to say the results are perfect, and Intel's announcement implies that the system works better when coupled with facial recognition to identify 'known' speakers. Indeed, Intel's web site shows that the best results can be achieved with a mix of video and audio recognition algorithms, the one giving weight to the choices made by the other, particularly as the levels of background noise increase.
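To illustrate the idea of one recogniser giving weight to the other's choices, here is a minimal sketch of noise-dependent score fusion. It is not Intel's AVSR code and does not use the OpenCV API: the function names, the linear ramp from 0 dB to 30 dB and the example probabilities are all assumptions made purely for illustration.

```python
import math


def audio_weight(snr_db, lo=0.0, hi=30.0):
    """Map a signal-to-noise ratio in dB to an audio weight in [0, 1].

    Assumption: a simple linear ramp. At or below `lo` dB the audio is
    ignored; at or above `hi` dB the video is ignored.
    """
    return min(1.0, max(0.0, (snr_db - lo) / (hi - lo)))


def fuse(audio_logp, video_logp, snr_db):
    """Combine per-word log-probabilities from two hypothetical recognisers.

    Each candidate word gets a weighted sum of its audio and video
    log-probabilities; the noisier the audio, the more the video counts.
    Returns the best-scoring word and the full score table.
    """
    w = audio_weight(snr_db)
    candidates = audio_logp.keys() & video_logp.keys()
    scores = {
        word: w * audio_logp[word] + (1.0 - w) * video_logp[word]
        for word in candidates
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Invented scores: the audio slightly favours "mat", while the video
    # strongly favours "cat" because no lip closure was seen.
    audio = {"mat": math.log(0.55), "cat": math.log(0.45)}
    video = {"mat": math.log(0.10), "cat": math.log(0.90)}

    for snr in (30.0, 5.0):
        best, _ = fuse(audio, video, snr)
        print(f"SNR {snr:>4} dB -> {best}")
```

With these made-up numbers the audio decides the outcome in quiet conditions and the video takes over as the noise rises, which is the behaviour the paragraph above describes. A real system would learn the weighting from data rather than use a fixed ramp.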
The code was developed by Intel's Research subsidiary, part of whose remit is to develop applications that make the most of mainstream PCs' processing power. In other words, Intel is developing code that helps encourage users to upgrade to more powerful chips, ideally - and given chip makers' relative market shares, almost certainly - those made by Intel.
Its motives may not be entirely philanthropic, but at least Intel is giving the code away with a minimum of restrictions. ®