We present a simple and effective self-supervised learning approach for speech recognition. The approach learns a model to predict the masked speech signals, in the form of discrete labels generated with a random-projection quantizer. In particular the quantizer projects speech inputs with a randomly initialized matrix, and does a nearest-neighbor lookup in a randomly-initialized codebook. Neither the matrix nor the codebook is updated during self-supervised learning. Since the random-projection quantizer is not trained and is separated from the speech recognition model, the design makes the approach flexible and is compatible with universal speech recognition architecture. On LibriSpeech our approach achieves similar word-error-rates as pr...
Recent advances with self-supervised learning have allowed speech recognition systems to achieve sta...
Self-supervised learning (SSL) achieves great success in speech recognition, while limited explorati...
We present a self-learning algorithm using a bottom-up based approach to automatically discover, acq...
Although supervised deep learning has revolutionized speech and audio processing, it has necessitate...
Deep neural networks trained with supervised learning algorithms on large amounts of labeled speech ...
Self-supervised learning via masked prediction pre-training (MPPT) has shown impressive performance ...
Self-supervised speech recognition models require considerable labeled training data for learning hi...
Despite recent advancements in deep learning technologies, Child Speech Recognition remains a challe...
While the general idea of self-supervised learning is identical across modalities, the actual algori...
A novel self-supervised discriminative training method for estimating language models for automatic ...
A novel and computationally straightforward clustering algorithm was developed for vector quantizati...
We investigate the performance of self-supervised pretraining frameworks on pathological speech data...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
The use of speech processing applications, particularly speech recognition, has got a lot of attenti...
We present a new Self-Supervised Learning (SSL) approach to pre-train encoders on unlabeled audio da...
Recent advances with self-supervised learning have allowed speech recognition systems to achieve sta...
Self-supervised learning (SSL) achieves great success in speech recognition, while limited explorati...
We present a self-learning algorithm using a bottom-up based approach to automatically discover, acq...
Although supervised deep learning has revolutionized speech and audio processing, it has necessitate...
Deep neural networks trained with supervised learning algorithms on large amounts of labeled speech ...
Self-supervised learning via masked prediction pre-training (MPPT) has shown impressive performance ...
Self-supervised speech recognition models require considerable labeled training data for learning hi...
Despite recent advancements in deep learning technologies, Child Speech Recognition remains a challe...
While the general idea of self-supervised learning is identical across modalities, the actual algori...
A novel self-supervised discriminative training method for estimating language models for automatic ...
A novel and computationally straightforward clustering algorithm was developed for vector quantizati...
We investigate the performance of self-supervised pretraining frameworks on pathological speech data...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
The use of speech processing applications, particularly speech recognition, has got a lot of attenti...
We present a new Self-Supervised Learning (SSL) approach to pre-train encoders on unlabeled audio da...
Recent advances with self-supervised learning have allowed speech recognition systems to achieve sta...
Self-supervised learning (SSL) achieves great success in speech recognition, while limited explorati...
We present a self-learning algorithm using a bottom-up based approach to automatically discover, acq...