National audience

This study presents a large-scale benchmark of cloud-based Speech-To-Text (STT) systems: Google Cloud Speech-To-Text, Microsoft Azure Cognitive Services, Amazon Transcribe, and IBM Watson Speech to Text. For each system, 40 158 clean and noisy speech files (about 101 hours) are tested. The effect of background noise on STT quality is also evaluated at 5 different signal-to-noise ratios, from 40 dB to 0 dB. Results show that Microsoft Azure provided the lowest transcription error rate, 9.09% on clean speech, with high robustness to noisy environments. Google Cloud and Amazon Transcribe gave similar performance, but the latter is very limited for time-constrained usage. Though IBM Watson could work correctly in quiet conditions, it is highly...
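The abstract does not say how the noisy files were produced. As a minimal sketch of one common way to reach a target signal-to-noise ratio, the noise can be scaled so that the ratio of speech power to noise power matches the requested value in dB before the two signals are added. The function names, the toy sample values, and the evenly spaced levels (40, 30, 20, 10, 0 dB) are assumptions made for illustration; they are not taken from the study.

```python
import math


def mean_power(samples):
    """Average power of a signal given as a sequence of float samples."""
    return sum(x * x for x in samples) / len(samples)


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain to apply to `noise` so that speech power / noise power = 10^(snr_db/10)."""
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise signal has zero power")
    return math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


def snr_db(speech, noise):
    """Measured SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


if __name__ == "__main__":
    # Toy signals standing in for real audio (hypothetical values).
    speech = [1.0, -1.0, 1.0, -1.0]
    noise = [0.5, 0.5, -0.5, -0.5]
    # Assumed levels: five evenly spaced steps between 40 dB and 0 dB.
    for target in (40, 30, 20, 10, 0):
        gain = noise_gain_for_snr(speech, noise, target)
        scaled = [gain * n for n in noise]
        print(target, round(snr_db(speech, scaled), 6))
```

In practice the speech and noise would be read from audio files and the mixture checked for clipping before being sent to an STT service; those steps are left out here.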
Automatic Speech Recognition (ASR) is an essential task for many applications like automatic capti...
Conducting “manual” transcriptions and analyses is unsustainable for most historical oral archives b...
The development of large-scale automatic classroom dialog analysis systems requires accurate speech-...
This study presents a large-scale benchmarking of cloud-based Speech-To-Text systems: {Google Cloud ...
The use of speech recognition on mobile devices has been possible with the development of cloud syst...
This document compares out-of-box performance of three commercially available speech recognition sof...
Deep learning technology has encouraged research on noise-robust automatic speech recognition (ASR)....
The technology which enables the recognition and transcription of spoken language to text by compute...
We perform an experimental evaluation of two popular cloud-based speech recognition systems...
This thesis describes the comparison of two Automatic Speech Recognition (ASR) systems, used in the ...
Interactive real-time communication between people and machine enables innovations in trans...
As the automatic speech recognition technology is becoming more advanced, the possibilities of in wh...
The collection and transcription of speech data is typically an expensive and time-consuming task. V...
Speech recognition has gained much attention from researchers for almost the last two decades. Isolated ...
The robustness and consistency of sensory inference models under changing environmental conditions a...