In this paper we construct a data set for semi-supervised acoustic model training by selecting spoken utterances from a massive collection of anonymized Google Voice Search utterances. Semi-supervised training usually retains high-confidence utterances which are presumed to have an accurate hypothesized transcript, a necessary condition for successful training. Selecting high-confidence utterances can, however, restrict the diversity of the resulting data set. We propose to introduce a constraint enforcing that the distribution of the context-dependent state symbols obtained by running forced alignment of the hypothesized transcript matches a reference distribution estimated from a curated development set. The quality of the obtained trai...
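The selection constraint described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a greedy search, a KL divergence as the mismatch measure, additive smoothing, and a fixed utterance budget, none of which are stated in the abstract. The names `kl_to_reference`, `select_utterances`, `min_confidence`, and `budget` are hypothetical, and the state sequences stand in for the output of forced alignment.

```python
import math
from collections import Counter


def kl_to_reference(counts, reference, smoothing=1e-3):
    """KL(reference || selected) over context-dependent state symbols.

    `counts` holds state counts of the utterances selected so far and
    `reference` maps each state to its probability on the development set.
    Additive smoothing (an assumption) keeps unseen states finite.
    """
    total = sum(counts.values()) + smoothing * len(reference)
    kl = 0.0
    for state, ref_p in reference.items():
        if ref_p > 0.0:
            sel_p = (counts.get(state, 0) + smoothing) / total
            kl += ref_p * math.log(ref_p / sel_p)
    return kl


def select_utterances(candidates, reference, min_confidence, budget):
    """Greedily pick utterances whose pooled state distribution tracks `reference`.

    `candidates` is a list of (utt_id, confidence, state_sequence) tuples.
    Utterances below `min_confidence` are dropped first; then, at each step,
    the utterance that yields the lowest divergence is added.
    """
    pool = [c for c in candidates if c[1] >= min_confidence]
    selected, counts = [], Counter()
    while pool and len(selected) < budget:
        best = min(
            pool,
            key=lambda c: kl_to_reference(counts + Counter(c[2]), reference),
        )
        selected.append(best[0])
        counts.update(best[2])
        pool.remove(best)
    return selected, counts
```

Each greedy step rescans the whole pool, so the cost grows quadratically with the number of candidates. A collection at the scale the abstract mentions would need a cheaper scheme, such as batching or sampling.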
The paper revives an older approach to acoustic modeling that borrows from n-gram language modeling ...
Visual information in the form of lip movements of the speaker has been shown to improve the perform...
There is growing recognition of the importance of data-centric methods for building machine learning...
Voice activation systems are used to find a pre-defined word or phrase in the audio stream. Industry...
To obtain a robust acoustic model for a certain speech recognition task, a large amount of speech da...
Current approaches to semi-supervised incremental learning prefer to select unlabeled examples predi...
We conduct a comparative study on selecting subsets of acoustic data for training phone recognizers...
Current approaches to semi-supervised incremental learning prefer to select unlabeled examples pred...
This paper investigates improving lightly supervised acoustic model training for an archive of broa...
This paper presents an extended study in the topic of optimal selection of speech data from a databa...
Accented speech that is under-represented in the training data still suffers high Word Error Rate (W...
© 2017 IEEE. Training neural network acoustic models on limited quantities of data is a challenging ...
INTERSPEECH2006: the 9th International Conference on Spoken Language Processing (ICSLP), September 1...
This paper deals with the task of statistical machine translation of spontaneous speech using a lim...