Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language specific. Training a multilingual system for Indic languages is even harder because open-source datasets and published results on different approaches are scarce. We compare the performance of an end-to-end multilingual speech recognition system with that of monolingual models conditioned on language identification (LID). The decoding information from a multilingual model is used for language identification and then combined with monolingual models, yielding a 50% improvement in word error rate (WER) across languages. We also propose a similar technique for the code-switched problem and achieve a WER of 21.77 and 28.27...
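The two-pass idea in the abstract above (decode with a multilingual model, infer the language, then re-decode with a monolingual model) can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not say how the decoding information is turned into a language decision, so the sketch assumes the language is inferred from the Unicode script of the first-pass transcript. The names `SCRIPT_RANGES`, `identify_script`, and `transcribe` are hypothetical, and the models are plain callables standing in for real ASR systems.

```python
from collections import Counter
from typing import Callable, Dict, Optional

# Unicode block ranges for a few Indic scripts.
# Assumption: script stands in for language; this cannot separate languages
# that share a script (for example Hindi and Marathi, both Devanagari).
SCRIPT_RANGES = {
    "devanagari": (0x0900, 0x097F),
    "bengali": (0x0980, 0x09FF),
    "tamil": (0x0B80, 0x0BFF),
    "telugu": (0x0C00, 0x0C7F),
}


def identify_script(text: str) -> Optional[str]:
    """Return the script with the most characters in text, or None if none match."""
    votes: Counter = Counter()
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                votes[name] += 1
                break
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def transcribe(
    audio: object,
    multilingual: Callable[[object], str],
    monolingual: Dict[str, Callable[[object], str]],
) -> str:
    """First pass with the multilingual model, second pass with a monolingual one."""
    first_pass = multilingual(audio)
    script = identify_script(first_pass)
    model = monolingual.get(script) if script is not None else None
    # Fall back to the multilingual transcript when no monolingual model applies.
    return model(audio) if model is not None else first_pass
```

Falling back to the first-pass transcript keeps the pipeline usable when the script is unrecognized or no monolingual model is registered for it. A real system would more likely use model posteriors or a dedicated LID classifier than a character vote.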
This paper describes the integration of language identification (LID) into a multilingual automat...
An independent, automated method of decoding and transcribing oral speech is known as automatic spee...
In this paper, phoneme sequences are used as language information to perform code-switched...
We study the effect of applying a language model (LM) on the output of Automatic Speech Recognition ...
Recent methods in speech and language technology pretrain very large models which are fine-tuned for...
We propose a new method for the calculation of error rates in Automatic Speech Recognition (ASR). Th...
Only a handful of the world’s languages are abundant with the resources that enable practical applic...
We present Vakyansh, an end to end toolkit for Speech Recognition in Indic languages. India is home ...
Code-switching (CS) in spoken language is where the speech has two or more languages within an utter...
A cornerstone in AI research has been the creation and adoption of standardized training and test da...
Code-switching deals with alternating languages in the communication process. Training end-to-end (E2E) ...
The idea of combining multiple languages’ recordings to train a single automatic speech recognition ...
Speakers in multilingual communities often switch between or mix multiple lang...
Engineering automatic speech recognition (ASR) for speech to speech (S2S) translation syste...
In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) tech...