
Eee by gum! Aye up, Microsoft, what's tha y' got? Cloud for accents?

Sorry – t'cloud for accents?


Microsoft has built a cloud service for applications so that software can attempt to understand specialist vocabularies and cope with dialects and accents.

Speech recognition works better if the algorithms can pick from a limited range of possible words and phrases, rather than attempting to recognise everything. Microsoft's new Custom Speech Service, which is part of the corporation's Cognitive Services suite, lets you upload examples of what you expect users to say. You can also assign different weightings, so that when the system is choosing between two possible interpretations of someone's speech input, it can be guided about which is more likely. This is called a Custom Language Model.
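To see why weightings help, here is a toy sketch of the idea in Python. It is not Microsoft's algorithm or API: the function name, the scores and the weights are all invented for illustration, and a real recogniser works on far richer data than a pair of candidate strings.

```python
# Toy illustration only: not Microsoft's API or algorithm.
# All names, scores and weights below are hypothetical.

def pick_interpretation(candidates, phrase_weights, default_weight=1.0):
    """Return the candidate text with the highest weighted score.

    candidates: list of (text, acoustic_score) pairs, as a hypothetical
        recogniser might produce for one utterance.
    phrase_weights: dict mapping expected phrases (lower case) to a
        weighting; phrases not listed get default_weight.
    """
    def score(item):
        text, acoustic_score = item
        return acoustic_score * phrase_weights.get(text.lower(), default_weight)

    best_text, _ = max(candidates, key=score)
    return best_text


# Two interpretations that sound alike; the raw scores slightly favour
# the wrong one.
candidates = [("wreck a nice beach", 0.52), ("recognise speech", 0.48)]

# The application tells the system which phrase it expects to hear.
weights = {"recognise speech": 2.0}

print(pick_interpretation(candidates, weights))
```

With no weights supplied, the same function picks "wreck a nice beach", because that candidate has the higher raw score. The weighting is what tips the choice.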

The service also supports custom acoustic models, which you create by uploading sound files together with their transcriptions. This is one way of training the system to deal with accents and dialects.

This kind of customisation makes a huge difference to the likely success of applications that support speech input.

Microsoft also announced that two services already in preview will be generally available in March. These are the Content Moderator, for detecting profanity and porn in text, video and images, and the Bing Speech API, for generic speech-to-text and text-to-speech services.

The Custom Speech Service uses the same API as the Bing Speech Service. Usage is a matter of first configuring your service on Microsoft Azure, and then calling it through a REST API or using a client library, available for .NET, Java (for Android) and Objective-C.
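As a rough sketch of what calling a service like this over REST might look like, the Python below builds an HTTP POST carrying audio, without sending it. The endpoint URL, the key and the header used to pass the key are placeholders and assumptions, not details from Microsoft's documentation, which should be consulted for the real values.

```python
import urllib.request

# Hypothetical placeholders: the real endpoint and key come from the
# service configuration on Azure.
ENDPOINT = "https://example.invalid/speech/recognize"
SUBSCRIPTION_KEY = "YOUR-KEY-HERE"


def build_recognition_request(audio_bytes, endpoint=ENDPOINT, key=SUBSCRIPTION_KEY):
    """Build (but do not send) an HTTP POST carrying WAV audio."""
    return urllib.request.Request(
        endpoint,
        data=audio_bytes,
        method="POST",
        headers={
            "Content-Type": "audio/wav",
            # Assumed header name for the key; check the service docs.
            "Ocp-Apim-Subscription-Key": key,
        },
    )


# Dummy bytes stand in for a real recording.
request = build_recognition_request(b"RIFF....WAVE")
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and parsing the response, once the placeholders have been replaced with real values.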

Importing the speech recognition library into an application in Visual Studio - note the reference to Project Oxford, the code name for Cognitive Services

Microsoft has been hyping its Cognitive Services, which now include 25 different APIs, for some time. In principle, there is plenty of potential for applications that support new kinds of interaction and automate tasks which would otherwise require human intervention.

The reality often falls short, though. At various events Microsoft has demonstrated a machine that guesses your age; I find I can take ten years off by removing my glasses. It has also shown a crude emotion detector, which is easily fooled by fake smiles or frowns.

Some customer support lines now use speech recognition to automate routing your call to the right person; it is often no better and sometimes worse than the old method of "press 1" for this and "press 2" for that.

The technology is improving, though, and voice-powered services such as Apple's Siri, Amazon's Alexa, Google Now and Microsoft Cortana have done a lot to familiarise users with what is possible.

Supporting voice control in an application without the use of a cloud service would be impossibly hard. Microsoft's API makes it relatively easy.

The Custom Speech Service is free for one concurrent request and up to 5,000 per month. After that it costs from $11.29 per day. Full details of the service and pricing are here. ®
