These files are features (i-vectors, x-vectors, MFCC features) extracted from the subset of the Mozilla Common Voice corpus version 12.0 used in the VarDial 2023 shared task on Discriminating Between Similar Languages - Speech (DSL-S 2023). This data set contains only the features for the training and development sections (and the Common Voice metadata) for the nine languages included in the shared task (see the shared task website for further information).
The Linguistic Data Consortium’s Human Subjects Data Collection lab conducts cross-channel speech co...
The AXIOM Voice Dataset has the main purpose of gathering audio recordings from Italian natural lang...
We present the results of the 2nd edition of the Discriminating between Similar Languages (DSL) sha...
CommonLanguage Dataset This dataset is composed of speech recordings from languages that were c...
This is the evaluation dataset for Task 6 (Subtask B), Language-based Audio Retrieval, in DCASE 2022...
CL-MASR Dataset This is the dataset used in the continual learning for multilingual ASR (CL-MASR) b...
This repository contains the datasets used in the article "Shared Acoustic Codes Underlie Emotional ...
Here are the checkpoints for the trained baseline system and audio encoder for language-based audio ...
This paper presents the compilation of the DSL corpus collection created for the DSL (Discriminating...
We introduce ChannelSet, a dataset which provides a launchpad for exploring the extraneous acoustic ...
ZIP files of folders containing all the datasets (audio file corpora) employed in our research to tr...
Many of the language identification (LID) systems are based on language models using machine learnin...
Datasets used in the article "Shared Acoustic Codes Underlie Emotional Communication in Music and Sp...
This report presents the results of the shared tasks organized as part of the VarDial Evaluation Cam...