Recent work on unsupervised contrastive learning of speech representation has shown promising results, but so far has mostly been applied to clean, curated speech datasets. Can it also be used with unprepared audio data "in the wild"? Here, we explore three potential problems in this setting: (i) presence of non-speech data, (ii) noisy or low-quality speech data, and (iii) imbalance in speaker distribution. We show that on the Libri-light train set, which is itself a relatively clean speech-only dataset, these problems combined can already have a performance cost of up to 30% relative for the ABX score. We show that the first two problems can be alleviated by data filtering, with voice activity detection selecting spee...
Training a speech recognition system needs audio data and their corresponding ...
Contrastive Predictive Coding (CPC), based on predicting future segments of sp...
Recent progress in self-supervised or unsupervised machine learning has opened...
We introduce a new collection of spoken English audio suitable for training sp...
We introduce a new collection of spoken English audio suitable for training speech recognition syste...
14 pages, including references and supplementary material. We introduce a new un...
Automatic speech recognition (ASR) requires a strong language model to guide the acoustic model and ...
Submitted to Interspeech 2021. arXiv admin note: text overlap with arXiv:2011.11588 ...
Cross-lingual and multilingual training of Automatic Speech Recognition (ASR) ...
Most recent speech recognition models rely on large supervised datasets, which are unavailable for m...
Automatic speech recognition for our most widely used languages has recently seen substantial impro...
Self-supervised learning (SSL) has been able to leverage unlabeled data to boost the performance of ...
There is growing recognition of the importance of data-centric methods for building machine learning...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
We introduce Generative Spoken Language Modeling, the task of learning the aco...