04 May, 2019

Recorded digital audio of interviews and meetings has long been the pain point of the voice-to-text world. Traditional voice recognition software has failed in its attempts to process audio with more than one speaker, so expensive transcription typists were the only viable option for converting meetings and interviews to text. That is, until now. Amazon Web Services (AWS) has launched a service called Amazon Transcribe, often referred to as AWS Transcribe. It converts single or multi-speaker audio to text quickly, cheaply and very accurately, all in the cloud.

To show you just how quick and accurate AWS Transcribe is, below is a video of an interview between ABC’s Leigh Sales and Australian politician Bill Shorten (a short 90-second audio extract), which we processed through AWS Transcribe:

As you can see, the output text is highly accurate. Each word is tagged with a confidence score and a time-stamp (see image below), which allows proofreaders to quickly focus on the words that need review. Punctuation is added automatically, so there is no need to speak “full stop”, “period” or “comma”; only formatting of the text needs to be applied.
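For proofreaders working with the raw output, the review step can be partly automated. Below is a minimal Python sketch, assuming the JSON layout Amazon Transcribe returns for a completed job: a results.items list in which each spoken word carries a start_time, an end_time and a confidence value inside alternatives. The 0.90 threshold, the function name and the sample data are our own illustrative assumptions, not values taken from the interview above.

```python
# Hypothetical cut-off: words scored below this are sent for human review.
REVIEW_THRESHOLD = 0.90


def words_needing_review(transcribe_output, threshold=REVIEW_THRESHOLD):
    """Return (word, confidence, start_seconds) for each low-confidence word."""
    flagged = []
    for item in transcribe_output["results"]["items"]:
        # Punctuation items carry no time-stamps, so skip them.
        if item.get("type") != "pronunciation":
            continue
        best = item["alternatives"][0]
        # Transcribe stores numbers as strings, so convert before comparing.
        confidence = float(best["confidence"])
        if confidence < threshold:
            flagged.append((best["content"], confidence, float(item["start_time"])))
    return flagged


# Made-up sample in the same shape as a real transcript file.
sample = {
    "results": {
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.35",
             "alternatives": [{"confidence": "0.99", "content": "Thank"}]},
            {"type": "pronunciation", "start_time": "0.35", "end_time": "0.52",
             "alternatives": [{"confidence": "1.0", "content": "you"}]},
            {"type": "pronunciation", "start_time": "0.52", "end_time": "0.70",
             "alternatives": [{"confidence": "0.98", "content": "for"}]},
            {"type": "pronunciation", "start_time": "0.70", "end_time": "1.21",
             "alternatives": [{"confidence": "0.62", "content": "coming"}]},
            {"type": "punctuation",
             "alternatives": [{"confidence": "0.0", "content": "."}]},
        ]
    }
}

for word, confidence, start in words_needing_review(sample):
    print(f"{start:.2f}s  {word}  (confidence {confidence:.2f})")
```

With a real transcript you would load the downloaded JSON file with json.load and pass the result to the same function. The time-stamp lets the proofreader jump straight to the matching point in the audio.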

In our post on the iDictate blog, we estimate that AWS Transcribe will cut voice-to-text conversion costs by 75%-80% while also significantly reducing turnaround times. This is a big deal, in terms of both speed and cost, for anyone using transcription services, whether outsourced or in-house. How will this affect the transcription typing industry?

Here at Dictate Australia, we are building a solution that accepts digital audio files and automatically converts them to text using AWS Transcribe on behalf of our customers, for those who do not feel comfortable setting up AWS accounts and services themselves. More on this service to come.
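For readers who would rather try the service themselves, here is a rough Python sketch of what a multi-speaker job request could look like using boto3, the AWS SDK for Python. It assumes boto3 is installed, AWS credentials are already configured, and the audio is an MP3 file already uploaded to S3. The job name, bucket name and helper function names are hypothetical, and this is not a description of the solution we are building.

```python
def build_transcription_request(job_name, media_uri, max_speakers=2):
    """Build the keyword arguments for boto3's start_transcription_job call."""
    if max_speakers < 2:
        # Speaker labelling needs at least two speakers to tell apart.
        raise ValueError("max_speakers must be 2 or more")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-AU",   # Australian English
        "MediaFormat": "mp3",      # assumption: the recording is an MP3
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            "ShowSpeakerLabels": True,        # label who said what
            "MaxSpeakerLabels": max_speakers,
        },
    }


def submit(request):
    """Send the request to Amazon Transcribe (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the request can be built without the SDK

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


# Hypothetical job name and S3 location, for illustration only.
request = build_transcription_request(
    "interview-demo", "s3://example-bucket/interview.mp3"
)
print(request)
```

Calling submit(request) starts the job. The transcript is produced asynchronously, so you would check back on the job later to collect the finished JSON.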

