Getting an AI to understand speech is already a tough nut to crack. A group of Australian researchers wants to take on something much harder: teaching once-deaf babies to talk.
Why so tough?
Think about what happens when you talk to Siri or Cortana or Google on a phone: the speech recognition system has to distinguish your “OK Google” (for example) from background noise; it has to react to “OK Google” rather than “OK something else”; and it has to parse your speech to act on the command.
And you already know how to talk.
The Swinburne University team working on an app called GetTalking can't make even that single assumption, because they're trying to solve a different problem. When a baby receives a cochlear implant to take over the work of a malfunctioning inner ear, that baby needs to learn something brand new: how to associate the sounds they can now hear with the sounds their own mouth makes.
Getting those kids started in the world of conversation is a matter of “habilitation” – no “rehabilitation” here, because there isn't a capability to recover.
GetTalking is the brainchild of Swinburne senior lecturer Belinda Barnet, and the genesis of the idea was her own experience as mother to a child with a cochlear implant.
Children interact well with apps. Can one teach children to talk? Image: Belinda Barnet
As she explained to The Register: “With my own daughter – she had an implant at 11 months old – I could afford to take a year off to teach her to talk. This involves lots of repetitive exercises.”
That time and attention, she explained, is the big predictor of success.
In the roughly 10 years since it became standard practice to provide implants to babies at or before 12 months of age (fully funded by Australia's national health insurance scheme Medicare since 2011), 80 per cent of recipients have achieved speech within the normal range.
What defines the 20 per cent that don't get to that point? Inability, either because of family income or distance from the city, to “spend a year sitting on the carpet with flash-cards”.
That makes it hard for parents in rural or regional locations, or for low-income mothers, Barnet said.
The idea for which Barnet and associate professor Rachael McDonald sought funding looks simple: an app to run on something like an iPad that gives the baby a bright visual reward for speaking.
However, it does test the boundaries of AI and speech recognition, because of a very difficult starting point: how can an app respond to speech when the baby has never learned to speak?
Speech recognition: ongoing quest
Apple never revealed the price it paid to acquire the team that developed Siri, but rumours of US$150 million don't sound unreasonable – and Siri takes its input from someone who knows how to speak.
For all the effort that's gone into speech recognition and AI, we also know it remains so difficult it's been automated for only a couple of per cent of languages.
Leon Sterling, a Swinburne computer science researcher, had his interest piqued as a member of the university panel assessing the project, and is now bringing his long experience of AI research to it.
He explained the hidden complexities behind what needs to present itself as a simple app.
“You've got to get the signal, you have to extract the signals, separate them from the background noise, the parents speaking, et cetera.”
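To give a flavour of the first step Sterling describes, here is a minimal sketch of energy-based voice activity detection: it splits audio into frames and flags those that are much louder than the background. This is an illustration only, not GetTalking's method; the function names, the 160-sample frame length and the threshold ratio are all assumptions made for the example, and a real system would also have to tell the child's voice apart from the parents'.

```python
import math
import statistics


def frame_energies(samples, frame_len):
    """Return the RMS energy of each full, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_voiced_frames(samples, frame_len=160, ratio=3.0):
    """Flag frames that are much louder than the estimated noise floor.

    Assumption: most frames contain only background noise, so the median
    frame energy is a usable estimate of the noise floor.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    noise_floor = statistics.median(energies)
    threshold = ratio * noise_floor
    return [energy > threshold for energy in energies]
```

Calling `detect_voiced_frames(samples)` on a list of audio samples returns one boolean per frame. The median-based noise floor only holds up when quiet frames outnumber loud ones, which is one reason a living room full of talking adults is a harder case than this sketch suggests.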
Swinburne's Leon Sterling
Most of those problems have precedent, but GetTalking needs yet more machine learning, for instance to measure the child's engagement with the app. “You've got to look at the ability to observe, to tag video strings together with audio strings.”
The team understands that an app can't replace a speech therapist or parent, but only support them – and that adds new complexities like “building in the knowledge of how children interact with physiotherapists. You need to understand the developmental stages of children when they're interacting with the app.”