While Turkish is listed among low-resource languages, the literature on Turkish automatic speech recognition (ASR) is relatively old. In this paper, we present HuBERT-TR, a speech representation model for Turkish based on HuBERT. HuBERT-TR achieves state-of-the-art results on several Turkish ASR datasets. We investigate pre-training HuBERT for Turkish with large-scale data curated from online resources: HuBERT-TR is pre-trained on over 6,500 hours of speech from YouTube, covering extensive variability in quality and genre. We show that language-specific models are superior to other pre-trained models: our Turkish model HuBERT-TR/base performs better than the 10 times larger state-of-the-art multil...
Self-supervised learning from raw speech has been proven beneficial to improve...
Speech is a paramount means of communication among humans, which makes recognition of the speech by ...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
To build automatic speech recognition (ASR) systems with a low word error rate (WER), a large speech...
Automatic speech recognition (ASR) is one of the most important applications of speech and language ...
Significant improvements have been made in automatic speech recognition (ASR) systems in terms of bo...
In this paper we develop a Turkish speech recognition (SR) system using deep neural networks and co...
For decades, the main components of Automatic Speech Recognition (ASR) systems have been pronunciati...
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech r...
Automatic speech recognition (ASR) requires a strong language model to guide the acoustic model and ...
The primary aim of this study was to contribute to the development of multilingual automatic speech ...
In this thesis, we develop deep learning models in automatic speech recognition (ASR) for two contra...
The application of deep neural networks to the task of acoustic modeling for automatic speech recogn...
Most recent speech recognition models rely on large supervised datasets, which are unavailable for m...
Some practical uses of ASR have been implemented, including the transcription of meetings and the us...