
Got bot? How to put it to work with Microsoft's Cortana Skills

This is not how Redmond got devs hooked on Windows

Bringing RegBot to Cortana

The primary route for developing a Cortana skill is to use the Bot Framework, which supports either .NET or Node.js code. If you already have a Bot Framework application, you can simply add Cortana as an additional channel for your bot. Since I already had such an application to hand, I did just that, bringing the RegBot that I developed earlier this year to Cortana – or so I thought.

Cortana is a channel for a Bot Framework application


It turned out to be not quite that easy. Once you add a bot to the Cortana channel, it becomes available to any instance of Cortana where you are signed in with the same Microsoft account used for registering the bot. However, I had trouble with the invocation name. My first effort was “RegBot”, but when I said to Cortana: “Ask RegBot...” it was interpreted as “Ask Reg but” and failed to connect to my service.

Note that Cortana skills currently only support voice input, so I could not work around this by typing. Text input is planned eventually, according to Microsoft.

In order to make it easier for Cortana, I used “The Register” instead. This worked fine. However, note that while you can use any invocation name you like for testing, there are legal and sanity checks when you want to publish your skill to the wider world.

I could now connect to my skill, but it did not work. It turns out that there are some little differences when you are using Cortana with the Bot Framework – enough to trip you up. One is that Cortana often needs to be told whether or not a message expects a response. I had a wait message, “Thinking for a moment,” that is returned instantly. This made Cortana expect further input from the user, which never came. The solution was to mark the message to ignore input.
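
To make the fix concrete, here is a minimal sketch at the level of the activity JSON a bot sends. The Bot Framework activity schema includes an `inputHint` field with the values `acceptingInput`, `expectingInput` and `ignoringInput`. The cut-down `MessageActivity` type, the helper names `waitMessage` and `prompt`, and the prompt wording are assumptions for illustration, not RegBot's actual code.

```typescript
// Hypothetical, trimmed-down model of a Bot Framework message activity.
// Only the fields relevant to this example are included.
type InputHint = "acceptingInput" | "expectingInput" | "ignoringInput";

interface MessageActivity {
  type: "message";
  text: string;
  inputHint?: InputHint;
}

// An interim status message: tell the channel not to wait for the user.
function waitMessage(text: string): MessageActivity {
  return { type: "message", text, inputHint: "ignoringInput" };
}

// A question that does need an answer, so the channel should listen.
function prompt(text: string): MessageActivity {
  return { type: "message", text, inputHint: "expectingInput" };
}

const thinking = waitMessage("Thinking for a moment");
const ask = prompt("What would you like to search for?");
```

In an SDK-based bot the same hint would normally be set through the SDK's message-building API rather than by hand; the plain objects here only show which field carries the information.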

Using RegBot in Cortana


Even though RegBot now worked in Cortana, it needed some refinement. By default, text is only displayed on Cortana’s canvas, but if you speak to a bot, you expect it to speak back. You do this by specifying the speak argument in messages sent to Cortana.
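
As a rough illustration, the activity schema has a `speak` field alongside `text`: the first is read aloud, the second is shown on the canvas. The `SpokenMessage` type, the `say` helper and the sample strings below are assumptions made for this sketch.

```typescript
// Hypothetical, trimmed-down message activity with display and spoken text.
interface SpokenMessage {
  type: "message";
  text: string;  // shown on Cortana's canvas
  speak: string; // read aloud by the text-to-speech engine
}

// Build a reply; if no spoken variant is given, speak the display text.
function say(text: string, spoken: string = text): SpokenMessage {
  return { type: "message", text, speak: spoken };
}

// The displayed and spoken wording can differ when that reads better aloud.
const reply = say(
  "Found 5 results for 'Cortana'",
  "I found five results for Cortana."
);

// With a single argument, the same string is both shown and spoken.
const greeting = say("Hello from RegBot");
```

Keeping the two strings separate is a design choice: text that scans well on screen, such as digits and quoted search terms, is not always what you want read out.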

Developing RegBot in Visual Studio


Admittedly RegBot is not a useful bot. All it does is a site-specific search. Still, even this simple exercise throws up some issues with Cortana skills as currently implemented. One of the issues is that Cortana is limited in what it can do hands-free. RegBot sends a list of results, for example, each of which has a button that opens a web page. You cannot click on the button using speech. I was able to get Cortana to open the first result in a web browser automatically, but in this case the list of other results disappeared.
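
For reference, a result list of this kind might look roughly like the sketch below: one hero card per result, each with an `openUrl` button. The card content type and the `openUrl` action type come from the Bot Framework card schema; the `SearchResult` shape, the helper names, the button label and the example.com URLs are placeholders, not RegBot's real data.

```typescript
// A button that opens a web page when clicked.
interface CardAction {
  type: "openUrl";
  title: string;
  value: string; // the URL to open
}

interface HeroCardAttachment {
  contentType: "application/vnd.microsoft.card.hero";
  content: { title: string; buttons: CardAction[] };
}

// Hypothetical shape of one search hit.
interface SearchResult {
  title: string;
  url: string;
}

// Turn one search hit into a card with a single "Read" button.
function toCard(r: SearchResult): HeroCardAttachment {
  return {
    contentType: "application/vnd.microsoft.card.hero",
    content: {
      title: r.title,
      buttons: [{ type: "openUrl", title: "Read", value: r.url }],
    },
  };
}

// Wrap all hits in one message, laid out as a vertical list.
function resultsMessage(results: SearchResult[]) {
  return {
    type: "message" as const,
    attachmentLayout: "list" as const,
    attachments: results.map(toCard),
  };
}

const sample = resultsMessage([
  { title: "First story", url: "https://example.com/a" },
  { title: "Second story", url: "https://example.com/b" },
]);
```

Every action in this structure is a click target, which is why the list is hard to use hands-free.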

This same issue is evident in Cortana's built-in skills. Cortana can launch a desktop application hands-free, but not close it or control it. Cortana does not integrate well with the speech recognition control built into Windows. It is designed to integrate with UWP (Universal Windows Platform) applications, which is why if you try to send an email with Cortana, it will use the Mail app and not Outlook. Cortana also exhibits quirky behaviour.

For example, if you use speech to tell Cortana to “Show desktop”, it will minimize all windows. If you say, “Show the desktop”, it will describe the keyboard shortcut. Why the difference? And if you enable the “Hey Cortana” feature, so that you can start Cortana hands-free, why does it sometimes open full-screen, and sometimes not?

On the plus side, if you develop a Cortana skill you get a certain amount of bot functionality almost for free, including decent speech recognition and a text-to-speech engine. There are limitations – there is only one voice available, for example – but even so, it is worthwhile. Currently there is no cost to the developer for these features, though a presenter at Build remarked that cost is “something we’re talking about internally.”

The ability to obtain some user profile information, subject to consent, is also valuable. Whether your bot does customer service or pizza delivery, getting location information is a good start.

In principle, users may use a Cortana skill on any platform including those without a screen, so unless you specifically need to integrate with Windows, best practice is to design a bot that will work entirely within Cortana's canvas and where all interaction can be done by voice.

Developing a Cortana skill is the quickest way to get a speech-enabled bot onto Windows, and – as such – it is worth attention if you have a suitable use case.

The preview has its frustrations and limitations, some of which will be fixed when this hits production. The wider question, though, is whether Microsoft can improve Cortana itself to become more useful than annoying. The platform has to be right before developers will want to extend it. ®
