Meet Microsoft's human-like speech recognition
MICROSOFT RESEARCHERS have claimed that the latest generation of speech recognition from the company's labs is on a par with that of a human being.
The team showed in a paper published yesterday, entitled Achieving Human Parity in Conversational Speech Recognition, that it had reached a word error rate of just 5.9 per cent, which is equivalent to an average human being, or seven Joey Essexes.
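For the record, word error rate is the standard yardstick here: the number of substitutions, insertions and deletions needed to turn the system's transcript into a human reference transcript, divided by the length of the reference. A back-of-the-envelope sketch of the calculation (our own illustration in Python, not anything from Microsoft's paper) looks like this:

    # Word error rate as edit (Levenshtein) distance over words,
    # divided by the reference length. Purely illustrative code.
    def word_error_rate(reference: str, hypothesis: str) -> float:
        ref = reference.split()
        hyp = hypothesis.split()
        # d[i][j] = minimum edits to turn the first i reference words
        # into the first j hypothesis words.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i          # i deletions
        for j in range(len(hyp) + 1):
            d[0][j] = j          # j insertions
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution
                d[i][j] = min(d[i - 1][j] + 1,        # deletion
                              d[i][j - 1] + 1,        # insertion
                              d[i - 1][j - 1] + cost)
        return d[len(ref)][len(hyp)] / len(ref)

    # Two substitutions in a three-word reference: WER of 67 per cent
    print(word_error_rate("fork handles please", "four candles please"))

A 5.9 per cent score means roughly one word in 17 comes out wrong, which, as the example above suggests, is about the level a human transcriber manages on conversational speech.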
The paper is very involved, so we'll leave you to peruse it at your leisure, but the gist is that this is an incredibly big deal: one of the holy grails of machine learning and artificial intelligence, as well as a massive boost for those with disabilities who perhaps can't use a keyboard.
A number of other companies, such as Google (of course) and Nuance, have demonstrated huge leaps and bounds in the use of deep learning to bolster machines' ability to interpret human speech, using knowledge of context and accent to ensure the right result.
Nowhere is that more obvious than the game-changing Amazon Echo, which has brought speech-controlled computing to the home for fifty quid.
Xuedong Huang, Microsoft’s chief speech scientist, told The Next Web: “We’ve reached human parity. This is a historic achievement.”
The message is clear, but parity with humans isn't perfection, as humans are far from perfect. You only have to watch the Four Candles sketch from The Two Ronnies for proof of that.
In fact, maybe that could be some sort of equivalent to the Turing Test for speech recognition.
Anyway, next on the hit list will be improving accuracy beyond human levels and making the system better at coping with noisy environments, multiple voices and loud music.
Neural networks, training sets, audio and video have come together to reach this point, but there’s still a way to go before you can rely on Cortana to 'Book Limp Bizkit tickets' without watching to make sure it hasn’t ordered a pallet of Jammy Dodgers. µ