Voice activation systems are used to find a pre-defined word or phrase in the audio stream. Industry solutions, such as “OK, Google” for Android devices, are trained with millions of samples. In this work, we propose and investigate several ways to train a voice activation system when the in-domain data set is small. We compare self-training exemplar pre-training, fine-tuning a model pre-trained on another domain, joint training on both an out-of-domain high-resource and a target low-resource data set, and unsupervised pre-training. In our experiments, the unsupervised pre-training and the joint-training with a high-resource data set from another domain significantly outperform a strong baseline of fine-tuning a model trained on another dat...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
Development of an ASR application such as a speech-oriented guidance system for a real environment i...
AbstractThis work addresses one of the common issues arising when building a speech recognition syst...
The problem of voice activation is to find a pre-defined word in the audio stream. Solutions such as...
In this paper we construct a data set for semi-supervised acous-tic model training by selecting spok...
Modern speech recognition systems exhibits rapid performance degradation under domain shift. This is...
Modern devices are more frequently equipped with voice control. Using speech to operate is a natural...
© 2018 International Speech Communication Association. All rights reserved. Designing a spoken langu...
© 2017 IEEE. Training neural network acoustic models on limited quantities of data is a challenging ...
To obtain a robust acoustic model for a certain speech recognition task, a large amount of speech da...
The development of a speech recognition system requires at least three resources: a large labeled sp...
This work considers training neural networks for speaker recognition with a much smaller dataset siz...
This paper investigates the potential of improving a hybrid automatic speech recognition model train...
This paper describes experiments in using speech data, collected by means of commercial services, in...
At present, specific voice control has gradually become an important means for 5G-IoT-aided industri...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
Development of an ASR application such as a speech-oriented guidance system for a real environment i...
AbstractThis work addresses one of the common issues arising when building a speech recognition syst...
The problem of voice activation is to find a pre-defined word in the audio stream. Solutions such as...
In this paper we construct a data set for semi-supervised acous-tic model training by selecting spok...
Modern speech recognition systems exhibits rapid performance degradation under domain shift. This is...
Modern devices are more frequently equipped with voice control. Using speech to operate is a natural...
© 2018 International Speech Communication Association. All rights reserved. Designing a spoken langu...
© 2017 IEEE. Training neural network acoustic models on limited quantities of data is a challenging ...
To obtain a robust acoustic model for a certain speech recognition task, a large amount of speech da...
The development of a speech recognition system requires at least three resources: a large labeled sp...
This work considers training neural networks for speaker recognition with a much smaller dataset siz...
This paper investigates the potential of improving a hybrid automatic speech recognition model train...
This paper describes experiments in using speech data, collected by means of commercial services, in...
At present, specific voice control has gradually become an important means for 5G-IoT-aided industri...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
Development of an ASR application such as a speech-oriented guidance system for a real environment i...
AbstractThis work addresses one of the common issues arising when building a speech recognition syst...