This paper introduces a new corpus of read English speech, suitable for training and evaluating speech recognition systems. The LibriSpeech corpus is derived from audiobooks that are part of the LibriVox project, and contains 1000 hours of speech sampled at 16 kHz. We have made the corpus freely available for download, along with separately prepared language-model training data and pre-built language models. We show that acoustic models trained on LibriSpeech give a lower error rate on the Wall Street Journal (WSJ) test sets than models trained on WSJ itself. We are also releasing Kaldi scripts that make it easy to build these systems.
Language models used in current automatic speech recognition systems are trained on general-purpose ...
ition (ASR): precisely on both English and German ASR track. Only primary submissions have been sent...
Unsupervised acoustic modeling can offer a cost and time effective way of creating a solid acoustic ...
We present a test corpus of audio recordings and transcriptions of presentations of students' enterp...
We aim at improving spoken language modeling (LM) using very large amount of a...
This paper aims at improving spoken language modeling (LM) using very large a...
This paper presents the corpus developed by the LIUM for Automatic Speech Recognition (ASR), based o...
Language Models (LMs) represent a crucial component in the architecture of Automatic Speech Recognit...
We aim at improving spoken language modeling (LM) using very large amount of automatically transcrib...
In this paper we describe the large-scale German broadcast corpus (GER-TV1000h) containing more than...
Spoken language speech recognition systems need better understanding of natura...
User experience is key to make a computer program successful. If the handling needs a lot of experti...
We describe the protocol used for collecting a corpus of conversational English speech from non-nati...
In this paper, we present CEASR, a Corpus for Evaluating ASR quality. It is a data set derived from ...
The increased availability of broadband connections has recently led to an increase in the use of In...