The field of speaker and language recognition is constantly being researched and developed, but much of this research is done on private or expensive datasets, making the field more inaccessible than many other areas of machine learning. In addition, many papers make performance claims without comparing their models to other recent research. With the recent development of public multilingual speech corpora such as Mozilla\u27s Common Voice as well as several single-language corpora, we now have the resources to attempt to address both of these problems. We construct an eight-language dataset from Common Voice and a Google Bengali corpus as well as a five-language holdout test set from Audio Lingua. We then compare one filterbank-based model...
The development of a speech recognition system requires at least three resources: a large labeled sp...
Abstract Empirical results have shown that many spoken language identification systems based on hand...
[[abstract]]N-gram language modeling is a crucial component in any speech recognizer since it is exp...
The field of speaker and language recognition is constantly being researched and developed, but much...
| openaire: EC/H2020/780069/EU//MeMADIn this paper, we propose a software toolkit for easier end-to-...
Most state-of-the-art spoken language identification models are closed-set; in other words, they can...
Most recent speech recognition models rely on large supervised datasets, which are unavailable for m...
Master of ScienceDepartment of Computer ScienceWilliam HsuThis research addresses the problem of cod...
End-to-end trainable deep neural networks have become the state-of-the-art architecture for automati...
A cornerstone in AI research has been the creation and adoption of standardized training and test da...
One particular problem in large vocabulary continuous speech recognition for low-resourced languages...
Languages are fundamental to human communication and serve as a means to express social and cultural...
Text-to-speech synthesis (TTS) has progressed to such a stage that given a large, clean, phoneticall...
The objective of this paper is speaker recognition under noisy and unconstrained conditions. We mak...
Many languages identification (LID) systems rely on language models that use machine learning (ML) a...
The development of a speech recognition system requires at least three resources: a large labeled sp...
Abstract Empirical results have shown that many spoken language identification systems based on hand...
[[abstract]]N-gram language modeling is a crucial component in any speech recognizer since it is exp...
The field of speaker and language recognition is constantly being researched and developed, but much...
| openaire: EC/H2020/780069/EU//MeMADIn this paper, we propose a software toolkit for easier end-to-...
Most state-of-the-art spoken language identification models are closed-set; in other words, they can...
Most recent speech recognition models rely on large supervised datasets, which are unavailable for m...
Master of ScienceDepartment of Computer ScienceWilliam HsuThis research addresses the problem of cod...
End-to-end trainable deep neural networks have become the state-of-the-art architecture for automati...
A cornerstone in AI research has been the creation and adoption of standardized training and test da...
One particular problem in large vocabulary continuous speech recognition for low-resourced languages...
Languages are fundamental to human communication and serve as a means to express social and cultural...
Text-to-speech synthesis (TTS) has progressed to such a stage that given a large, clean, phoneticall...
The objective of this paper is speaker recognition under noisy and unconstrained conditions. We mak...
Many languages identification (LID) systems rely on language models that use machine learning (ML) a...
The development of a speech recognition system requires at least three resources: a large labeled sp...
Abstract Empirical results have shown that many spoken language identification systems based on hand...
[[abstract]]N-gram language modeling is a crucial component in any speech recognizer since it is exp...