Automatic speech recognition (ASR) software, e.g., as implemented in voice-activated virtual assistants or other applications, is prone to confusing words that sound alike (homophones and near-homophones). Keyboard-style corrections, e.g., those based on the edit distance between transcribed words, are suboptimal for such transcription errors because similar-sounding words can be spelled very differently. This disclosure describes techniques that predict the N-best speech-to-text transcription alternatives for a given word, wherein the suggested replacements are homophones or words with similar sounds. The techniques can be used in any context where automatic speech recognition is used, e.g., to enable correction of commands provided to a virtual assistant, to modify transcribed speech, etc.
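The idea of ranking replacements by sound rather than by spelling can be illustrated with a minimal sketch. It is not the disclosed method: the toy pronunciation lexicon, the ARPAbet-style phoneme strings, and the names `LEXICON`, `phoneme_distance`, and `n_best_alternatives` are all assumptions introduced here for illustration. The sketch simply applies Levenshtein distance to phoneme sequences instead of characters and returns the N closest-sounding words.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon: word -> ARPAbet-style phonemes (stress marks omitted).
# A real system would use a full pronunciation dictionary or a learned model.
LEXICON: Dict[str, Tuple[str, ...]] = {
    "write": ("R", "AY", "T"),
    "right": ("R", "AY", "T"),
    "rite": ("R", "AY", "T"),
    "ride": ("R", "AY", "D"),
    "light": ("L", "AY", "T"),
    "writer": ("R", "AY", "T", "ER"),
    "wrote": ("R", "OW", "T"),
}


def phoneme_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance computed over phoneme symbols, not characters."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete a phoneme from a
                            curr[j - 1] + 1,      # insert a phoneme from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def n_best_alternatives(
    word: str, n: int = 3, lexicon: Dict[str, Tuple[str, ...]] = LEXICON
) -> List[Tuple[str, int]]:
    """Return up to n other words ranked by phoneme distance to `word`."""
    target = lexicon[word]
    scored = [
        (other, phoneme_distance(target, phones))
        for other, phones in lexicon.items()
        if other != word
    ]
    # Sort by distance first, then alphabetically so ties are deterministic.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:n]


if __name__ == "__main__":
    print(n_best_alternatives("write", 2))
```

In this toy lexicon, "right" and "rite" are at phoneme distance 0 from "write" even though their spellings differ by several character edits, which is why a spelling-based correction list would tend to rank them poorly.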
This disclosure describes techniques that leverage the context of a conversation between a user and ...
The present contribution aims at increasing our understanding of automatic speech recognition (ASR) ...
A regular automatic speech recognizer works with a so-called recognition lexicon. This lexicon conta...
Automatic speech recognizers (ASR) typically treat each utterance of a conversation independently. T...
Automatic speech recognition (ASR) models are used to recognize voice commands or queries from users...
This disclosure describes techniques to correct errors in automatic speech recognition, e.g., as per...
Techniques to improve the process of correcting text transcription of a voice input are described. W...
Automatic speech recognition (ASR) systems currently reach enough performance to be integrated in va...
Selecting the best prediction from a set of candidates is an essential problem for many spoken langu...
It is well known that human listeners significantly outperform machines when it comes to transcribin...
Modern automatic speech recognition (ASR) systems are speaker independent and designed to recognize ...
A speech recognition (ASR) system is a technology that allows computers to receive the input using the sp...
Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to atta...
This thesis addresses the problems of phonemic variability and confusability from the pronunciation ...
Many application environments have already used speech interfaces. But the low speech recognition rate...