Microsoft's Cognitive Toolkit on GitHub in all its speech-recognising glory

Now go forth – and develop armies of soulless stenographers

Microsoft has released a catalogue of AI software under Microsoft Cognitive Toolkit on GitHub today.

The new toolkit is an updated version of the Computational Network Toolkit, which was developed by a team of computer scientists interested in speech recognition and natural language processing.

It has since expanded into other areas. The 22 APIs cover computer vision, emotion recognition, web search and text analysis, and have been updated to be compatible with C++ and Python.

Microsoft's AI researchers have already used the toolkit to build an automated system capable of recognising recorded speech on the NIST 2000 Switchboard task at a word error rate of 5.9 per cent.
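
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the system's output, divided by the number of words in the reference. Here is a minimal sketch, assuming plain whitespace tokenisation and no text normalisation; the function name and sample sentences are illustrative and are not taken from Microsoft's toolkit or from NIST's scoring tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper only: real evaluations also normalise text
    (case, punctuation, hesitations) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

On this definition, a word error rate of 5.9 per cent means roughly six errors (substitutions, deletions or insertions) for every 100 words in the reference transcript.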

Heralded as a "major breakthrough" in speech recognition, the system performs slightly better than the level needed to be a professional transcriptionist. Microsoft is keen to integrate the system into its AI assistant, Cortana.

The sudden surge of capabilities in AI has actually been brewing away for over 20 years, Xuedong Huang, Microsoft’s Chief Speech Scientist and a developer of the Microsoft Cognitive Toolkit (MCT), told The Register.

A combination of large datasets, better computer infrastructure and deep learning – something Huang calls "the three pillars" of AI – has led to sudden and significant advances in the field.

In the past it could take up to two months to train a speech recognition model on a single GPU, but with the toolkit it takes only days. Huang credits these fast training times for Microsoft's "breakthroughs" in speech recognition.

In September, Huang's team announced it had achieved the lowest error rate in computer speech recognition, at 6.3 per cent. But a month later, the error rate has fallen to 5.9 per cent and the system has reached "human parity".

More than 20 years ago, the error rate was higher than 60 per cent. Now that AI is rapidly improving, it's important to "democratise" the technology, Huang told The Register.

He hopes that developers will seize the opportunity to use the toolkit for research or to create new products with novel applications. ®
