Microsoft just made another big advancement in artificial intelligence.
Researchers at the company have created speech recognition software that can transcribe speech as accurately as humans do. The development, detailed in a paper published Monday, makes it the most advanced speech recognition software to date, according to Microsoft, which also set the previous record for speech recognition.
In the study, the software had a word error rate of 5.9 percent, which is about the same as that of human transcribers.
Here’s how Microsoft explains it:
The research milestone doesn’t mean the computer recognized every word perfectly. In fact, humans don’t do that, either. Instead, it means that the error rate – or the rate at which the computer misheard a word like “have” for “is” or “a” for “the” – is the same as you’d expect from a person hearing the same conversation.
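To make the metric concrete, here is a minimal Python sketch of how a word error rate is commonly calculated. It assumes the standard definition: the smallest number of substituted, deleted and inserted words needed to turn the software's transcript into the reference transcript, divided by the number of words in the reference. The article does not describe Microsoft's actual scoring procedure, so the function name `word_error_rate` and the sample sentences are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` measured against `reference`.

    Assumes the standard definition: word-level edit distance
    (substitutions + deletions + insertions) divided by the number
    of words in the reference transcript.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Hypothetical example: the recognizer hears "is" for "have" and "a" for "the".
reference = "we have a plan for the weekend"
hypothesis = "we is a plan for a weekend"
print(f"{word_error_rate(reference, hypothesis):.1%}")
```

In this made-up example, two of the seven reference words are misheard, so the error rate is 2/7, or roughly 28.6 percent. A 5.9 percent rate works out to about six errors for every 100 words spoken.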
Previously, the researchers had reached an error rate of 6.3 percent and had set their sights on reaching human-level accuracy next. That was just over a month ago.
The software itself relies on deep neural networks — technology that interprets data in a way similar to how the human brain works — as well as specialized graphics processing units (GPUs) that allow the software to learn at speeds not previously possible.
The milestone has far-reaching implications. On a practical level, it means that Microsoft’s products could soon be a whole lot better at understanding humans. The researchers name Microsoft’s personal assistant app Cortana and the Xbox as two products that could immediately benefit from the research. Accessibility software, such as instant transcription services, could also benefit from the advancement.
It could also easily be incorporated into Microsoft’s productivity tools like Office — imagine how much better Word’s dictation feature would be with near-human levels of accuracy — or its enterprise offerings.
Consumer products aside, it also marks a turning point for AI research. In a statement, Geoffrey Zweig, from Microsoft’s Speech and Dialog research group, notes that the next phase is to help build software that can not only transcribe human speech but also understand it. That goal is much further away, but being able to accurately transcribe human speech is a big step forward.