Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. Such methods have shown success in natural language processing and computer vision domains, achieving new levels of performance while reducing the number of labels required for many downstream scenarios. Speech representation learning is experiencing similar progress in three main categories: generative, contrast...
International audienceSeveral deep neural networks have recently been shown to generate activations ...
The use of speech processing applications, particularly speech recognition, has got a lot of attenti...
For a language with no transcribed speech available (the zero-resource scenario), conventional acous...
Deep neural networks trained with supervised learning algorithms on large amounts of labeled speech ...
While the general idea of self-supervised learning is identical across modalities, the actual algori...
Self-supervised representation learning methods aim to provide powerful deep feature learning withou...
In this thesis, we analyzed and compared speech representations extracted from different frozen self...
We present a simple and effective self-supervised learning approach for speech recognition. The appr...
International audienceRecent progress in self-supervised or unsupervised machine learning has opened...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
Self supervised representation learning has recently attracted a lot of research interest for both t...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
Methods for extracting audio and speech features have been studied since pioneering work on spectrum...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
International audienceSelf-Supervised Learning (SSL) using huge unlabeled data has been successfully...
International audienceSeveral deep neural networks have recently been shown to generate activations ...
The use of speech processing applications, particularly speech recognition, has got a lot of attenti...
For a language with no transcribed speech available (the zero-resource scenario), conventional acous...
Deep neural networks trained with supervised learning algorithms on large amounts of labeled speech ...
While the general idea of self-supervised learning is identical across modalities, the actual algori...
Self-supervised representation learning methods aim to provide powerful deep feature learning withou...
In this thesis, we analyzed and compared speech representations extracted from different frozen self...
We present a simple and effective self-supervised learning approach for speech recognition. The appr...
International audienceRecent progress in self-supervised or unsupervised machine learning has opened...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
Self supervised representation learning has recently attracted a lot of research interest for both t...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
Methods for extracting audio and speech features have been studied since pioneering work on spectrum...
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. ...
International audienceSelf-Supervised Learning (SSL) using huge unlabeled data has been successfully...
International audienceSeveral deep neural networks have recently been shown to generate activations ...
The use of speech processing applications, particularly speech recognition, has got a lot of attenti...
For a language with no transcribed speech available (the zero-resource scenario), conventional acous...