Microsoft’s Cortana voice-controlled digital assistant is being opened up so apps can use it for processing spoken commands, according to a presentation at the US giant's WinHEC event in Shenzhen, China.
Cortana has several party tricks. One is to understand your speech, using a voice-recognition engine built into Windows. Another is to hook into applications like contacts, calendar, tasks, music and maps, in order to call phone numbers, create appointments and reminders, play songs, and get directions. Cortana is also a user interface for search, and anything Cortana cannot deal with directly is handed off to a Bing search in the cloud.
Cortana’s “Notebook” is where you customise the service with a database of personal interests. For example, if you add the Trip Planner interest, Cortana will detect flight itineraries in your email, and alert you to the flight status five hours beforehand. You can also add favourite sports teams, celebrities, companies, news interests, companies in which you hold shares, and so on, and Cortana will show you a personalised news feed. Another feature is “quiet hours”, which on a phone will send all calls to voicemail unless the calls come from an inner circle of approved contacts.
Developed first for Windows Phone, Cortana is now part of Windows 10 for PCs, too. A Cortana app may also come to iOS and Android, though not necessarily with the same level of integration and extensibility.
Now applications can tap into the Cortana software, and use it to take spoken instructions. At WinHEC, attendees were told that they could:
- Add voice commands to Cortana to interact with third-party apps, using the Universal App Platform (UAP), including web-hosted apps (essentially web applications with access to device features).
- Interact with the Cortana canvas, including “UI templates” so you can show images, respond to button clicks or taps, enable text input and so on.
- Register to handle built-in Cortana tasks. This presumably means that you can have your own app override the default Cortana action for things like playing music.
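To make those three points more concrete, here is a minimal Python sketch of the general idea: apps register handlers for spoken phrases, an app can take over a built-in task, and anything unrecognised falls through to search. This is not the Cortana or UAP API. Every name in it (`VoiceCommandRouter`, `register`, `dispatch`, and the handler functions) is hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict

# Hypothetical illustration only: none of these names come from Cortana or UAP.
Handler = Callable[[str], str]


class VoiceCommandRouter:
    """Routes a recognised utterance to the handler registered for its leading phrase."""

    def __init__(self, fallback_search: Handler) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._fallback = fallback_search

    def register(self, phrase_prefix: str, handler: Handler) -> None:
        # A later registration for the same phrase replaces the earlier one,
        # which models an app overriding a built-in task.
        self._handlers[phrase_prefix.strip().lower()] = handler

    def dispatch(self, utterance: str) -> str:
        text = utterance.strip().lower()
        # Try the longest registered phrase first so that "play album"
        # would win over "play" if both were registered.
        for prefix in sorted(self._handlers, key=len, reverse=True):
            if text == prefix or text.startswith(prefix + " "):
                argument = text[len(prefix):].strip()
                return self._handlers[prefix](argument)
        # Nothing matched: hand the whole utterance to search.
        return self._fallback(text)


def web_search(query: str) -> str:
    return f"search: {query}"


def builtin_music(argument: str) -> str:
    return f"built-in player: {argument}"


def third_party_music(argument: str) -> str:
    return f"third-party player: {argument}"


if __name__ == "__main__":
    router = VoiceCommandRouter(fallback_search=web_search)
    router.register("play", builtin_music)
    print(router.dispatch("Play some jazz"))
    router.register("play", third_party_music)  # the app takes over "play"
    print(router.dispatch("Play some jazz"))
    print(router.dispatch("Weather in Shenzhen"))
```

Matching on a whole leading phrase, rather than any substring, keeps a word like "playlist" from triggering the "play" handler. A real speech platform would have to cope with far messier input than this sketch does.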
[Image: Cortana recommends a tech story for me]
Since WinHEC is a hardware conference, there was also a focus on how to create a satisfactory audio system for Cortana to operate successfully, including the microphone design. Anyone with experience of voice recognition will confirm that using a high-quality microphone makes all the difference.
Will developers take up this opportunity? There are several angles on this. One is that although voice input right now is pretty useless in many environments – such as a typical open-plan office – it can be a great feature in certain scenarios, for example when driving, or as an assistive technology in cases where keyboard, mouse or touch control is difficult. (Chinese web giant Baidu claims it has solved the noise problem: its Deep Speech engine can pick out words spoken in a busy restaurant, we're told.)
Voice recognition is never perfect, though, and the broader the range of commands, the more likely you are to get errors.
At WinHEC, attended by Windows OEMs, minds will be turning towards how vendors can, ahem, improve your experience by customising Cortana on new PCs. If every command mysteriously turns into a trigger for advertising some service or other, you will have the Cortana speech platform to blame.
Slides from the WinHEC session are available here, and a session video may follow. ®