Self-supervised learning (SSL) has been able to leverage unlabeled data to boost the performance of automatic speech recognition (ASR) models when only a small amount of transcribed speech data is available. However, this raises the question of which subset of the available unlabeled data should be selected for transcription. Our work investigates different unsupervised data selection techniques for fine-tuning the HuBERT model under a limited transcription budget. We investigate the impact of speaker diversity, gender bias, and topic diversity on downstream ASR performance. We also devise two novel techniques for unsupervised data selection: pre-training-loss-based data selection and the perplexity of byte-pair-encoded clustered u...
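To make the idea of selecting data under a limited transcription budget concrete, the following is a minimal sketch and not the authors' implementation. It assumes each unlabeled utterance already has a scalar score (for example a pre-training loss or a unit-sequence perplexity) and greedily picks utterances until a duration budget is used up. The names `Utterance`, `select_under_budget`, and `highest_first` are hypothetical, and since the abstract as shown does not say whether high or low scores should be preferred, the direction is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    utt_id: str
    duration_s: float  # audio length in seconds
    score: float       # hypothetical per-utterance score (e.g. a pre-training loss)


def select_under_budget(utterances, budget_s, highest_first=True):
    """Greedily pick utterances by score until the duration budget is exhausted.

    Assumption: ranking direction is a free choice here, not taken from the source.
    """
    ranked = sorted(utterances, key=lambda u: u.score, reverse=highest_first)
    selected, used = [], 0.0
    for u in ranked:
        # Skip any utterance that would push the total past the budget.
        if used + u.duration_s <= budget_s:
            selected.append(u)
            used += u.duration_s
    return selected


# Toy pool with made-up durations and scores, for illustration only.
pool = [
    Utterance("a", 4.0, 0.9),
    Utterance("b", 6.0, 0.2),
    Utterance("c", 5.0, 0.7),
    Utterance("d", 3.0, 0.5),
]
chosen = select_under_budget(pool, budget_s=10.0)
```

With a 10-second budget and highest-score-first ranking, the toy pool yields utterances `a` and `c`; flipping `highest_first` to `False` selects from the low-score end instead.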
Some practical uses of ASR have been implemented, including the transcription of meetings and the us...
Self-Supervised Learning (SSL) using huge unlabeled data has been successfully explored for image an...
While Automatic Speech Recognition (ASR) models have shown significant advances with the introductio...
In recent years, speech-based self-supervised learning (SSL) has made significant progress in variou...
Training a speech recognition system needs audio data and their corresponding ...
Automatic speech recognition (ASR) requires a strong language model to guide the acoustic model and ...
Self-supervised speech recognition models require considerable labeled training data for learning hi...
Advances in self-supervised learning have significantly reduced the amount of transcribed audio requ...
Speaker dependent (SD) ASR systems have significantly lower word error rates (WER) compared to speak...
Automatic speech recognition (ASR) technology has matured over the past few decades and has made sig...
In active learning for Automatic Speech Recognition (ASR), a portion of data is automatically selec...
High quality transcription data is crucial for training automatic speech recognition (ASR) systems. ...
Recent years have witnessed great strides in self-supervised learning (SSL) on the speech processing...
Development of an ASR application such as a speech-oriented guidance system for a real environment i...