Tandem acoustic modeling consists of taking the outputs of a neural network discriminantly trained to estimate the phone-class posterior probabilities of speech, and using them as the input features of a conventional distribution-modeling Gaussian mixture model (GMM) speech recognizer, thereby employing two acoustic models in tandem. This structure reduces the error rate on the Aurora 2 noisy English digits task by more than 50% compared to the HTK baseline. Even though there are some reasonable hypotheses to explain this improvement, its origins are still unclear. This paper introduces the use of visualization tools for error analysis of some variations of the tandem system. The error behavior is first analyzed using word-level confusion m...
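The two-stage structure described in this abstract can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the three-class activations, and the GMM parameters are all hypothetical, and taking the logarithm of the posteriors before passing them to the GMM is an assumed choice that the abstract does not state.

```python
import math

def softmax(logits):
    """Turn network output activations into posterior probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tandem_features(logits, floor=1e-10):
    """Log posteriors, used as the feature vector for the GMM stage.

    Assumption: the log is applied here; the abstract only says that the
    network outputs become the GMM input features.
    """
    return [math.log(max(p, floor)) for p in softmax(logits)]

def diag_gaussian_loglik(x, mean, var):
    """Log-density of x under one diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of x under a diagonal GMM (log-sum-exp over components)."""
    terms = [
        math.log(w) + diag_gaussian_loglik(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical frame: network activations for three phone classes.
frame_logits = [2.0, 0.5, -1.0]
features = tandem_features(frame_logits)

# Hypothetical two-component GMM standing in for one HMM state.
score = gmm_loglik(
    features,
    weights=[0.6, 0.4],
    means=[[-0.3, -1.8, -3.3], [-2.0, -0.5, -2.5]],
    variances=[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
)
```

In a full recognizer the first stage would be a trained network and the second a set of state GMMs inside an HMM decoder; here both are reduced to fixed numbers so that the hand-off between the two models is visible.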
Local state or phone posterior probabilities are often investigated as local scores (e.g., hybrid HM...
Hidden Markov models (HMMs) have been the mainstream acoustic modelling approach for state-of-the-ar...
This paper describes a hybrid system for continuous speech recognition consisting of a neural networ...
In tandem acoustic modeling, signal features are first processed by a discriminantly-trained neural ...
In the tandem approach to modeling the acoustic signal, a neural-net preprocessor is first discrimin...
Tandem systems transform the cepstral features into posterior probabilities of subword units using a...
Tandem systems transform the cepstral features into posterior probabilities of subword units using a...
The problem we address in this paper is whether the feature extraction module trained on large amou...
Recent studies have shown that speech recognizers may benefit from data in languages other than the ...
Hidden Markov model speech recognition systems typically use Gaussian mixture models to est...
We investigate the use of the log-likelihood of the features obtained from a generative Gaussian mix...
We describe some aspects of a Broadcast News recognition system based on hybrid HMM/MLP acoustic mod...
A hybrid connectionist-HMM speech recognizer uses a neural network acoustic classifier. This network...
Posterior-based acoustic modeling techniques such as Kullback–Leibler divergence based HMM (KL-HMM)...
The so-called tandem approach, where the posteriors of a multilayer perceptron (MLP) classifier are ...