Hands on: Microsoft has invested heavily in Cognitive Services, its programmable artificial intelligence APIs, along with a Bot Framework for using them through a conversational user interface. How easy is it to get started?
Cognitive Services, the AI piece, was announced at the company’s Build developer conference in April 2015. The initial release had just four services: face recognition, speech recognition, visual content recognition, and language understanding. That has now been extended to over 20 APIs.
Note that Cognitive Services, which are pre-baked specialist APIs, are distinct from Azure Machine Learning, which lets you do generalized predictive analytics based on your own data.
A year or so later, at the March 2016 Build event, Microsoft announced the Bot Framework for building conversational user interfaces. This links naturally to Cognitive Services, since a bot needs some sort of language parsing service. Both are included (along with a bunch of other stuff) in Microsoft’s overall machine learning and AI offering, called Cortana Intelligence Suite.
At the recent QCon software development conference in London, Microsoft’s exhibition stand was focused entirely on Cognitive Services, and the company gave a couple of presentations on the subject (albeit in the sponsored track), though not without glitches. “I don’t know why it hasn’t picked up Seattle as a place,” said the presenter. Note that both the Bot Framework and the Language Understanding Intelligent Service (LUIS) are still in preview.
The main use cases for bots are sales and customer service. Actions like booking travel or appointments, searching for hotels, and reporting faults are suitable. In most cases interacting with a human is preferable, but it is also more expensive. Another argument is that messaging services are now so popular that it pays to have an integrated presence there.
How hard is it to build a bot on Microsoft’s platform? I sat down to build a Reg bot. In this case the main service is to offer content, so to keep things simple I decided the bot should simply search the site for material in response to a query.
There are several moving parts:
LUIS: Your bot has to send text to LUIS for interpretation, which means you have to create and publish a LUIS app.
Bot Framework: Microsoft’s cloud service provides the channels your bot uses to communicate. There are currently 11 channels, including Skype, Facebook Messenger, web page widgets, Direct Line (a REST API direct to your bot), Slack, Microsoft Teams, and SMS via Twilio.
Bing Search API: The bot has to know how to search the Register site; using a search API is the quickest way.
Hosting: A bot is itself a web service, and you have to host it somewhere. The tools lead you towards Microsoft Azure, but anywhere that can host an ASP.NET Web API application should do. Your LUIS app also has to be hosted, and that can only be on Azure. RegBot uses a free Cognitive Services account and the lowest paid-for web app hosting service.
It makes sense to start with LUIS rather than running up Visual Studio immediately. LUIS is a service that accepts a string of text and parses it into an Intent, along with one or more Entities. You can think of an Intent as a verb and an Entity as a noun. RegBot currently has two Intents, DoSearch and Help, and one Entity, TechSubject.
You set up your LUIS app by typing example text strings that match your Intent and tagging them with their Entities. So “Tell me about malware” becomes Intent: DoSearch, TechSubject: Malware. You can test your LUIS app on the page.
Training the RegBot language understanding service
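To make the Intent and Entity idea concrete, here is a minimal sketch of reading a LUIS-style reply in C#. The JSON field names (topScoringIntent, entities, type, entity) are an assumption modelled on the service's published endpoint format and may differ between API versions; the sample payload is invented, and LuisResponseReader, LuisParse and LuisEntity are hypothetical names, not part of any Microsoft SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical types for illustration; not part of any Microsoft SDK.
public record LuisEntity(string Type, string Text);
public record LuisParse(string Intent, double Score, IReadOnlyList<LuisEntity> Entities);

public static class LuisResponseReader
{
    // Invented sample payload. The field names are an assumed response shape.
    public const string SampleJson = @"{
      ""query"": ""Tell me about malware"",
      ""topScoringIntent"": { ""intent"": ""DoSearch"", ""score"": 0.97 },
      ""entities"": [ { ""entity"": ""malware"", ""type"": ""TechSubject"" } ]
    }";

    // Pull out the winning Intent (the "verb") and any Entities (the "nouns").
    public static LuisParse Read(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        JsonElement top = root.GetProperty("topScoringIntent");

        var entities = new List<LuisEntity>();
        if (root.TryGetProperty("entities", out JsonElement items))
        {
            foreach (JsonElement item in items.EnumerateArray())
            {
                entities.Add(new LuisEntity(
                    item.GetProperty("type").GetString() ?? "",
                    item.GetProperty("entity").GetString() ?? ""));
            }
        }

        return new LuisParse(
            top.GetProperty("intent").GetString() ?? "None",
            top.GetProperty("score").GetDouble(),
            entities);
    }
}
```

Calling LuisResponseReader.Read(LuisResponseReader.SampleJson) yields a DoSearch intent with a single TechSubject entity, which is all a bot like RegBot needs in order to decide what to search for.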
Once you have done the LUIS bit you can get going with the code. I found and installed a Visual Studio Bot Application template, started a new project, restored NuGet packages (which downloads libraries from Microsoft’s repository), and got an error: “The name GlobalConfiguration does not exist.” A quick search told me to add the WebHost package. That is how development is today; you mix up various pre-built pieces and hope they get along.
Unfortunately, the Bot Application template is not pre-configured for LUIS. My approach was to find another bit of sample code which includes LUIS support and borrow a few pieces from it. I also downloaded the Bot Framework Emulator, which lets you test your bot locally.
I messed around with various App IDs and secret keys to hook up my bot to the LUIS app. A key feature of the Bot Framework is that it keeps track of your conversation by means of a context object, so that your app is able to interact with the user. RegBot does not need much interaction, but to test this I wrote code that asks the user how many results they want to see. It does this with one line of code:
PromptDialog.Number(context, AfterDoSearch, "How many results would you like?");
AfterDoSearch is the name of the method that gets called when the user responds. Each type of interaction therefore needs a separate method. This wrapping means state management is taken care of for you, which is a substantial benefit.
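The prompt-then-resume pattern is easier to see in a stripped-down model. The sketch below is not the Bot Framework: MiniContext and RegBotSketch are hypothetical stand-ins written only to show how a context object can remember which method should handle the next reply, so that the calling code never has to track where the conversation has got to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a dialog context; NOT the Bot Framework's own type.
public sealed class MiniContext
{
    private readonly Action<MiniContext, string> _root;  // handler for a fresh message
    private Action<MiniContext, string> _next;           // handler for the next message

    public Dictionary<string, string> Data { get; } = new Dictionary<string, string>();
    public List<string> Sent { get; } = new List<string>();

    public MiniContext(Action<MiniContext, string> root)
    {
        _root = root;
        _next = root;
    }

    public void Post(string text) => Sent.Add(text);

    // Ask a question and remember which method should receive the answer.
    public void PromptNumber(Action<MiniContext, string> resume, string question)
    {
        _next = resume;
        Post(question);
    }

    // Called once per incoming message; routes it to the stored handler,
    // then falls back to the root handler unless a new prompt is issued.
    public void Receive(string userText)
    {
        Action<MiniContext, string> handler = _next;
        _next = _root;
        handler(this, userText);
    }
}

public static class RegBotSketch
{
    // Treats the incoming text as the search subject, then asks a follow-up.
    public static void OnMessage(MiniContext context, string text)
    {
        context.Data["subject"] = text;
        context.PromptNumber(AfterDoSearch, "How many results would you like?");
    }

    // Runs only when the user answers the prompt issued above.
    public static void AfterDoSearch(MiniContext context, string reply)
    {
        string subject = context.Data["subject"];
        if (int.TryParse(reply, out int count) && count > 0)
            context.Post($"Searching for {count} results about {subject}");
        else
            context.PromptNumber(AfterDoSearch, "Please enter a number greater than zero.");
    }
}
```

Creating new MiniContext(RegBotSketch.OnMessage) and feeding it "malware" followed by "3" through Receive produces the question and then the search message. The real framework does considerably more, but the division of labour is similar: one method per step, with the context deciding which one runs next.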
Getting the Bing Search API working took more time than expected. It turns out that a News Search works better than a Web Search, since it has a useful Description field. I also spent time working out how the JSON response was structured; either I missed it, or Microsoft could do with some more basic samples.
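For anyone facing the same JSON puzzle, here is a rough sketch of pulling headlines out of a News Search style response. It assumes the results arrive in a top-level value array whose items carry name, url and description fields; the sample payload is invented placeholder data, and NewsResults and NewsHit are hypothetical names. The HTTP request and subscription key handling are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical type for illustration; holds the fields a bot would show.
public record NewsHit(string Name, string Url, string Description);

public static class NewsResults
{
    // Invented placeholder payload; the "value" array shape is an assumption.
    public const string SampleJson = @"{
      ""_type"": ""News"",
      ""value"": [
        { ""name"": ""First headline"", ""url"": ""https://example.com/1"",
          ""description"": ""Summary of the first story."" },
        { ""name"": ""Second headline"", ""url"": ""https://example.com/2"",
          ""description"": ""Summary of the second story."" }
      ]
    }";

    // Return at most maxResults hits; missing fields become empty strings.
    public static List<NewsHit> Read(string json, int maxResults)
    {
        var hits = new List<NewsHit>();
        using JsonDocument doc = JsonDocument.Parse(json);
        if (!doc.RootElement.TryGetProperty("value", out JsonElement value))
            return hits;

        foreach (JsonElement item in value.EnumerateArray())
        {
            if (hits.Count >= maxResults) break;
            hits.Add(new NewsHit(
                Text(item, "name"), Text(item, "url"), Text(item, "description")));
        }
        return hits;
    }

    // Read a string property defensively, since not every item has every field.
    private static string Text(JsonElement item, string name) =>
        item.TryGetProperty(name, out JsonElement p) && p.ValueKind == JsonValueKind.String
            ? (p.GetString() ?? "")
            : "";
}
```

The maxResults argument is where the answer to "How many results would you like?" would plug in, and the description field is the one that makes a news search more useful to a bot than a plain web search.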
The Bot Framework emulator talking to RegBot
I got it working, connected RegBot to Skype, and successfully tested my bot. Publishing it to the wide world involves a few more steps, so not yet. A few thoughts though.
This can work well for simple, well-defined use cases. LUIS is a bit of a black box, but has the advantage that you can see when it goes wrong and try to fix it. Once your app is up and running, it is easy to modify. The ability to code your own bot with a few hours of work is impressive.
That said, none of it is very sophisticated. Throw anything more than a short, simple sentence at LUIS, and it will quickly get confused or give up.
It would not be difficult to add speech recognition and text-to-speech via yet more Cognitive Services, though in the case of RegBot that is not much use unless it also reads web content back to you.
I came into this project as a bot sceptic. The technology is clever, but not clever enough to be useful other than in a few niche cases, or for collecting some basic information before handing over to a human.
You can imagine eyes lighting up at the thought of replacing call center staff with a few lines of code. Now it is simple to run up a prototype showing why, most of the time, this is probably not a good idea.
More positively, the bot concept is a newish way to interact with users: one that is amenable to voice and therefore handy for in-car use or other scenarios where typing is difficult. It does not feel ready yet, particularly in the case of the difficult AI piece, but give it time. ®