The ParlaSpeech-HR dataset is built from parliamentary proceedings available in the Croatian part of the ParlaMint corpus and the parliamentary recordings available from the Croatian Parliament's YouTube channel. The corpus consists of segments 8-20 seconds in length. There are two transcripts available: the original one, and the one normalised via a simple rule-based normaliser. Each of the transcripts contains word-level alignments to the recordings. Each segment has a reference to the ParlaMint 2.1 corpus (http://hdl.handle.net/11356/1432) via utterance IDs. If a segment is based on a single utterance, speaker information for that segment is available as well. There is speaker information available for 381,849 segments, i.e., 95% of all ...
The components of the Frisian data collection are speech and language ...
We present a test corpus of audio recordings and transcriptions of presentations of students' enterp...
Automatic speech recognition (ASR) systems require large amounts of transcribed speech data, for tra...
The JuzneVesti-SR dataset consists of audio recordings and manual transcripts from the Južne Vesti w...
ARTUR is a speech database designed for the needs of automatic speech recognition for the Slovenian ...
ARTUR is a speech database designed for the needs of automatic speech recognition for the Slovenian ...
We present a large corpus of Czech parliament plenary sessions. The corpus consists of approximately...
The corpus consists of recordings from the Chamber of Deputies of the Parliament of the Czech Republ...
Automatic Speech Recognition (ASR) models can aid field linguists by facilitating the creation of te...
Funding Information: This work has been supported by the MeMAD project of the European Union’s Horiz...
Nos_ParlaSpeech-GL is an ASR corpus of more than 1,600 hours of automatically aligned speech and tex...
Lahjoita puhetta baseline speech recognition model, built with the Kaldi toolkit. Trained on 1600 ho...
The ParlaMeter-hr corpus contains minutes of the National Assembly of the Republic of Croatia and cu...
Automatic speech recognition (ASR) technology has matured over the past few decades and has made sig...
CL-MASR Dataset This is the dataset used in the continual learning for multilingual ASR (CL-MASR) b...