Tipsy tongues tell all: How your sloshed speech could snitch to Siri

Alexa, am I wasted?

Wondering if that wee tipple was a bit too much? Someday soon your sloshed speech may spill your secrets to your resident digital assistant as easily as you stumble through a tongue-twister.

A group of researchers from Stanford University in the US and the University of Toronto in Canada have developed an algorithmic method of doing just that. In a paper published this week, the boffins report that they managed to identify alcohol intoxication with 98 percent accuracy by having study participants read tongue-twisters after imbibing a number of vodka gimlets (that's vodka, lime, and a bit of simple syrup for sweetness, to those who haven't been introduced).

"With the proliferation of smartphone sensors, we can now harness digital signals to more accurately predict when drinking episodes happen, enhancing our ability to intervene at the most effective moments," lead author Dr Brian Suffoletto, associate professor of emergency medicine at Stanford, told The Register.

Suffoletto said that he's been working to develop chat-based tools aimed at curbing risky drinking for more than a decade, with the timing of support being the most critical element.

To gather data, participants in the study were served gimlets which were "administered according to standard procedures" (i.e. drunk), with the goal of getting them to a breath alcohol concentration above 0.20 percent, or well into the "very impaired" range. Participants were then asked to read a randomly chosen tongue-twister every hour with a smartphone in front of them on a table – for seven hours.

Commonly known English tongue-twisters were used for the study, like Peter Piper, She Sells Sea Shells, Woodchuck, and Betty Botter, Suffoletto told us.

Speech samples were cleaned up and parsed into one-second segments, and after running them through an algorithm designed to examine spectral and frequency-based voice features, the system spat out results with the aforementioned 98 percent accuracy.
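For the curious, the segment-then-measure idea can be sketched in a few lines of Python. This is only an illustration of the general approach, not the researchers' code: the function names, the two features (RMS energy and spectral centroid), and the toy 200 Hz sample rate are all our own assumptions, and the paper's actual feature set and classifier are not reproduced here.

```python
import math

def split_into_segments(samples, sample_rate, seconds=1.0):
    """Cut a recording into non-overlapping fixed-length segments.
    Any trailing partial segment is dropped."""
    size = int(sample_rate * seconds)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def rms_energy(segment):
    """Root-mean-square amplitude of one segment."""
    return math.sqrt(sum(x * x for x in segment) / len(segment))

def spectral_centroid(segment, sample_rate):
    """Magnitude-weighted mean frequency, using a naive DFT.
    Slow, but dependency-free; real code would use an FFT library."""
    n = len(segment)
    total = weighted = 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(segment))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(segment))
        mag = math.hypot(re, im)
        total += mag
        weighted += mag * (k * sample_rate / n)
    return weighted / total if total else 0.0

def extract_features(samples, sample_rate):
    """One feature dict per one-second segment (hypothetical feature set)."""
    return [
        {"rms": rms_energy(seg), "centroid_hz": spectral_centroid(seg, sample_rate)}
        for seg in split_into_segments(samples, sample_rate)
    ]

# Toy input: 2.5 seconds of a 20 Hz tone sampled at 200 Hz (assumed values).
RATE = 200
tone = [math.sin(2 * math.pi * 20 * t / RATE) for t in range(int(2.5 * RATE))]
features = extract_features(tone, RATE)
for i, f in enumerate(features):
    print(i, round(f["rms"], 3), round(f["centroid_hz"], 1))
```

In a real system, per-segment features like these would be fed to a trained classifier; which features and which model work best is exactly what the study set out to explore.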

"Our model outperformed the best-performing prior model using the only other known voice recording alcohol corpus we are aware of," the researchers said. That corpus [PDF] was gathered from German speakers in 2011, and the earlier model built on it was 70 percent accurate.

The team in this latest study attributes its improved accuracy to several factors, including a standard set of tongue-twisters that reduced variability between individuals and timepoints in recordings. The team also credited its decision to examine frequency and pitch rather than "time-based features relating to phonemes and prosody, which may differ greatly between individuals."

More research needed, not to mention privacy worries

While the results might be impressive, even the researchers warn that their study is little more than a proof of concept that needs far more work before it's a viable commercial product.

The science of using a person's voice as a biomarker for alcohol intoxication is a relatively nascent field, the team conceded, and with only 18 participants, the study wasn't nearly large enough to be more than a sign that more drinks need to be served, sorry, more data needs to be gathered.

"We were not able to externally validate our models" beyond the 18-person sample size, the team noted. The entire participant pool was white and non-Hispanic, which further limits the generalizability of their findings. It's also possible, they said, that those with more tongue-twister practice could more easily fool such an algorithm, and they're not sure "whether voice signatures would be useful to detect lower risk drinking events," i.e. those who weren't well into the "schwasted" zone.

Besides the technical limitations, the team also isn't sure such tech would be accepted by the public.

"It remains unknown whether individuals would perceive programs that process speech samples as intrusive," the researchers said in their paper. "Therefore, we do not know whether it would be an acceptable method to use in the real world."

Future studies, the team said, need a larger, more mixed participant group, and researchers should give "serious consideration" to partnering with companies like Amazon "that already collect speech samples from smart speakers to test models using real-world data."

Whether such studies will occur isn't clear, though. "There are no current large-scale studies either planned or ongoing that I am aware of," Suffoletto told us. "It would take a larger funder like the NIH to be interested enough to support such an effort."

Imagine a future in which such research did make it to a viable product. You could get into your car after spending a few hours at your local bar, which your smartphone knows because it's been logging your geolocation data. You press the ignition button on your smart vehicle, which is also well aware you were texting your friends with plans to get some drinks, but nothing happens.

"Your location data suggests you may have been drinking. Before you can start your vehicle, [Google Assistant/Siri/etc.] requires that you read the following tongue-twister," your phone chirps. You stumble through Peter Piper Picked a Peck of Pickled Peppers and wait. "Sorry, but you appear intoxicated. Would you like to call an Uber?" ®
