Amazon AWS Transcribe: The Nuance Dragon Slayer?

30 December, 2018

Will 2019 be the year that Nuance's domination of voice recognition starts to wane thanks to Amazon's foray into ASR (Automatic Speech Recognition)? We think so.


Image Credit: Dragonslayer by Lonesome Cowboy on Daz3D.com

The arrival of Amazon's AWS Transcribe in late 2018 brings the prospect of major disruption to the speech recognition space. Why? AWS is addressing some of the key voice-to-text issues that Nuance has struggled with for years, in particular:

  • Multi-speaker voice to text e.g. meetings and interviews
  • Streamed real-time audio converted to text
  • No profile training
  • No expensive add-on language packs for specific industries (Nuance, we're looking at Legal and the old Medical)
  • No expensive software to install on your PC (Mac was discontinued by Nuance recently)

Multi-speaker audio is the real breakthrough for AWS Transcribe: you can now take any audio, not just your own voice, and have it converted to text. The opportunity for AWS is huge, moving speech recognition out of the traditional legal/medical space and into pretty much every industry.

Some of the examples cited by Amazon include:

Customer Service - When you call a company for help or support, your phone calls, which have always been recorded, can now be automatically converted to text, a boon for analytics in any business.

Cataloging Audio Archives - Think of the amount of audio out there: TV, radio, research, education, the list is endless. That audio can now be converted to text, so any subject can be easily found and referenced.

Captioning/Subtitles - Always a pain point for any video producer, whether creating live or recorded content. Audio can now be passed through AWS Transcribe and converted to time-coded text. YouTube creators will love this one.

The key features of AWS Transcribe are:

Easy-to-Read Transcriptions

Most speech recognition systems output a string of text without punctuation. Amazon Transcribe uses deep learning to add punctuation and formatting automatically so that the output is more intelligible and can be used without any further editing.

Timestamp Generation

Amazon Transcribe returns a timestamp for each word, so that you can easily locate the audio in the original recording by searching for the text.
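As a rough illustration of searching by text, the sketch below looks up a word in a transcript and returns where it was spoken. It assumes the result file has the `results.items` layout shown in the trimmed sample (word entries with `start_time`, `end_time` and `alternatives`); the sample data and the `find_word` helper are our own invention for illustration, not part of any AWS library.

```python
import json

# Trimmed, made-up sample in the assumed shape of a Transcribe result file.
SAMPLE = json.loads("""
{"results": {"items": [
  {"type": "pronunciation", "start_time": "0.04", "end_time": "0.35",
   "alternatives": [{"confidence": "0.99", "content": "The"}]},
  {"type": "pronunciation", "start_time": "0.35", "end_time": "0.90",
   "alternatives": [{"confidence": "0.97", "content": "Dragon"}]},
  {"type": "pronunciation", "start_time": "0.90", "end_time": "1.40",
   "alternatives": [{"confidence": "0.98", "content": "sleeps"}]},
  {"type": "punctuation",
   "alternatives": [{"confidence": "0.0", "content": "."}]}
]}}
""")


def find_word(transcript, word):
    """Return a list of (start, end) times in seconds for each match of word."""
    hits = []
    for item in transcript["results"]["items"]:
        # Punctuation entries have no timestamps, so skip them.
        if item.get("type") != "pronunciation":
            continue
        content = item["alternatives"][0]["content"]
        if content.lower() == word.lower():
            hits.append((float(item["start_time"]), float(item["end_time"])))
    return hits


if __name__ == "__main__":
    for start, end in find_word(SAMPLE, "dragon"):
        print(f"'dragon' spoken from {start:.2f}s to {end:.2f}s")
```

With a real result file you would load the JSON from disk instead of using `SAMPLE`, then jump to the returned start time in your audio player.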

Support for a Wide Range of Use Cases

Amazon Transcribe is designed to provide accurate and automated transcripts for a wide range of audio quality. You can generate subtitles for any video or audio files, and even transcribe low-quality telephony recordings such as customer service calls.

Custom Vocabulary

Amazon Transcribe gives you the ability to expand and customize the speech recognition vocabulary. You can add new words to the base vocabulary and generate highly-accurate transcriptions specific to your use case, such as product names, domain-specific terminology, or names of individuals.

Recognise Multiple Speakers (the game changer)

Amazon Transcribe is able to recognise when the speaker changes and attribute the transcribed text appropriately. This can significantly reduce the amount of work needed to transcribe audio with multiple speakers like telephone calls, meetings, and television shows.
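To give a feel for what that attribution looks like in practice, here is a minimal sketch that turns a speaker-labelled result into a readable "who said what" script. It assumes the result contains a `results.speaker_labels.segments` list whose word entries carry a `start_time` and a `speaker_label` such as `spk_0`; the sample data and the `speaker_turns` helper are hypothetical and only for illustration.

```python
# Made-up sample in the assumed shape of a speaker-labelled result.
SAMPLE = {
    "results": {
        "speaker_labels": {
            "speakers": 2,
            "segments": [
                {"speaker_label": "spk_0", "start_time": "0.00", "end_time": "0.40",
                 "items": [{"speaker_label": "spk_0", "start_time": "0.00", "end_time": "0.40"}]},
                {"speaker_label": "spk_1", "start_time": "0.90", "end_time": "1.40",
                 "items": [{"speaker_label": "spk_1", "start_time": "0.90", "end_time": "1.10"},
                           {"speaker_label": "spk_1", "start_time": "1.10", "end_time": "1.40"}]},
            ],
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.00", "end_time": "0.40",
             "alternatives": [{"content": "Hello"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
            {"type": "pronunciation", "start_time": "0.90", "end_time": "1.10",
             "alternatives": [{"content": "Hi"}]},
            {"type": "pronunciation", "start_time": "1.10", "end_time": "1.40",
             "alternatives": [{"content": "there"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ],
    }
}


def speaker_turns(transcript):
    """Group the transcript into consecutive (speaker, text) turns."""
    results = transcript["results"]

    # Map each word's start time to the speaker it was attributed to.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for word in segment["items"]:
            speaker_at[word["start_time"]] = word["speaker_label"]

    turns = []  # each entry is [speaker, text so far]
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp: attach it to the current turn.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


if __name__ == "__main__":
    for speaker, text in speaker_turns(SAMPLE):
        print(f"{speaker}: {text}")
```

Matching words to speakers by start time is just one simple way to join the two lists; a more robust version might compare time ranges instead.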

Channel Identification

Amazon Transcribe is able to process audio and video where each speaker is recorded on a different channel. Contact centres stand to benefit significantly by submitting a single audio file to Amazon Transcribe, which will identify each channel and produce a single transcript annotated with channel labels.

AWS Transcribe is extremely easy to use. We have already been testing it with audio from the Olympus LS-P4 (perfect for meeting/interview recording) and even Voice Notes on our iOS and Mac devices.

AWS Transcribe requires audio in one of the following formats:

  • mp3, mp4 (m4a), wav and flac 
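If you plan to submit jobs from a script rather than the AWS console, the sketch below shows one way to assemble a job request that picks the media format from the file extension and optionally switches on a custom vocabulary and speaker labels. The parameter names follow our understanding of the boto3 `start_transcription_job` call, so check them against the current AWS documentation; the helper name, job name, bucket and vocabulary name are all hypothetical. The sketch only builds the request and does not contact AWS.

```python
import os

# File extensions mapped to the media formats listed above (m4a is sent as mp4).
MEDIA_FORMATS = {
    ".mp3": "mp3",
    ".mp4": "mp4",
    ".m4a": "mp4",
    ".wav": "wav",
    ".flac": "flac",
}


def build_job_request(job_name, media_uri, language_code="en-US",
                      vocabulary_name=None, max_speakers=None):
    """Build keyword arguments for a transcription job (no AWS call is made)."""
    extension = os.path.splitext(media_uri)[1].lower()
    if extension not in MEDIA_FORMATS:
        raise ValueError(f"Unsupported audio format: {extension or 'none'}")

    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": MEDIA_FORMATS[extension],
        "Media": {"MediaFileUri": media_uri},
    }

    settings = {}
    if vocabulary_name:
        # Assumes a custom vocabulary with this name was created beforehand.
        settings["VocabularyName"] = vocabulary_name
    if max_speakers:
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if settings:
        request["Settings"] = settings
    return request


if __name__ == "__main__":
    # Hypothetical job name, bucket and vocabulary, for illustration only.
    request = build_job_request(
        "board-meeting-demo",
        "s3://example-bucket/board-meeting.m4a",
        vocabulary_name="product-names",
        max_speakers=4,
    )
    print(request)
    # With AWS credentials configured, the request could then be submitted:
    #   import boto3
    #   boto3.client("transcribe").start_transcription_job(**request)
```

Rejecting unknown extensions up front saves a round trip to AWS for files that would need converting first.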

As always with voice recognition, minimise background noise for best results.
