Abstract. We present a study of semi-supervised acoustic model training for Gaussian mixture model (GMM) hidden Markov model (HMM) and deep neural network (DNN) HMM systems. We analyze how transcription quality and the data sampling approach affect the performance of the resulting model, and we propose a multisystem combination and confidence re-calibration approach to improve transcription inference and data selection. Compared with using a single system's recognition result and confidence score, the proposed approach reduces the phone error rate of the inferred transcription by 23.8% relative when the top 60% of the data are selected. Experiments were conducted on the mobile short message dictation (SMD) task. For the GMM-HMM model, we achieved 7.2% relati...
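The data selection step mentioned above (keeping the top 60% of automatically transcribed data by confidence) and the notion of a relative error reduction can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_top_fraction` and `relative_reduction`, the utterance ids, and all confidence values are hypothetical, and the multisystem combination and confidence re-calibration themselves are not reproduced here. The sketch only assumes that each utterance already carries a single confidence score.

```python
from typing import List, Tuple


def select_top_fraction(
    utterances: List[Tuple[str, float]], fraction: float = 0.6
) -> List[str]:
    """Return ids of the highest-confidence utterances.

    `utterances` holds (utterance_id, confidence) pairs. How the
    confidence is computed is outside this sketch (assumption).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Rank from most to least confident.
    ranked = sorted(utterances, key=lambda u: u[1], reverse=True)
    # Number of utterances to keep, rounded to the nearest integer.
    keep = round(len(ranked) * fraction)
    return [utt_id for utt_id, _ in ranked[:keep]]


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Relative error reduction: (baseline - new) / baseline."""
    if baseline_error <= 0.0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Made-up confidences, for illustration only.
    pool = [("u1", 0.90), ("u2", 0.20), ("u3", 0.70), ("u4", 0.50), ("u5", 0.95)]
    print(select_top_fraction(pool, 0.6))
    print(relative_reduction(10.0, 8.0))
```

Under this definition, a relative reduction is measured against the baseline error rate rather than as a difference in absolute percentage points, which is how figures such as "23.8% relative" are conventionally read.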
Hidden Markov models (HMMs) have been the mainstream acoustic modelling approach for state-of-the-ar...
In this work we assess the recently proposed hybrid Deep Neural Network/Gaussian Mixture Model (DNN/...
In this paper, we investigate employment of discriminatively trained acoustic features modeled by Su...
Gaussian Mixture Model-Hidden Markov Models (GMM-HMMs) are the state-of-the-art for acoustic modelin...
Training acoustic models for ASR requires large amounts of labelled data which is costly to obtain. ...
In conventional hidden Markov model (HMM) based speech recognisers, the emitting HMM states are mode...
In this paper, we investigate semi-supervised training (SST) method in various state-of-the-art acou...
We investigate two strategies to improve the context-dependent deep neural network hidden Markov mod...
Recently, context-dependent (CD) deep neural network (DNN) hidden Markov models (HMMs) have been wid...
Speech recognition applications are known to require a significant amount of resources (training dat...
In this paper, we present several methods for mapping recognition engine requirements to mobile phon...
Acoustic modeling in state-of-the-art speech recognition systems usually relies on hidden Markov mod...
Hybrid deep neural network-hidden Markov model (DNN-HMM) systems have become the state-of-the-art in...