Self-supervised speech recognition models require considerable labeled training data to learn high-fidelity representations for Automatic Speech Recognition (ASR), which is computationally demanding and time-consuming and hinders the use of these models in resource-constrained environments. We consider the task of identifying an optimal subset of data for training self-supervised speech models for ASR. We make the surprising observation that dataset pruning strategies used in vision tasks to sample the most informative examples perform no better than random subset selection when fine-tuning self-supervised ASR. We then present the COWERAGE algorithm for better subset selection in self-supervised ASR, which is base...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
The modern paradigm in speech processing has demonstrated the importance of scale and compute for en...
We present a simple and effective self-supervised learning approach for speech recognition. The appr...
We investigate the performance of self-supervised pretraining frameworks on pathological speech data...
Advances in self-supervised learning have significantly reduced the amount of transcribed audio requ...
Despite recent advancements in deep learning technologies, Child Speech Recognition remains a challe...
Although supervised deep learning has revolutionized speech and audio processing, it has necessitate...
In recent years, speech-based self-supervised learning (SSL) has made significant progress in variou...
Self-supervised pre-training could effectively improve the performance of low-resource automatic spe...
Self-supervised learning (SSL) has been able to leverage unlabeled data to boost the performance of ...
Self-supervised learning via masked prediction pre-training (MPPT) has shown impressive performance ...
We employ a combination of recent developments in semi-supervised learning for automatic speech reco...
Self-supervised speech models have grown fast during the past few years and have proven feasible for...
We apply transfer learning to the task of phoneme segmentation and demonstrate the utility of repres...
Recent years have witnessed great strides in self-supervised learning (SSL) on the speech processing...