Automatic speech recognition (ASR) systems usually consist of an acoustic model and a language model. This paper describes a technique for efficiently distributing the acoustic model parameters. The acoustic model typically uses Continuous Density Hidden Markov Models (CDHMMs). The output probability of each CDHMM state is represented by a Gaussian mixture density with diagonal covariance matrices. Usually, the output probability density function of every CDHMM state contains the same number of mixture components, although using a different number of components in individual states may yield more accurate recognition results, especially for low-resource ASR systems. The central idea is to assign more components to states where it is ...
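The state output density mentioned in this abstract, a Gaussian mixture with diagonal covariance, can be illustrated with a short Python sketch. This is not the paper's implementation: the function names (`diag_gaussian_logpdf`, `gmm_state_loglik`) and the two example states are hypothetical, chosen only to show that each state may carry its own number of mixture components.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance, evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_state_loglik(x, weights, means, variances):
    """Log output probability of one state: log sum_k w_k N(x; mu_k, diag(var_k))."""
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical states (assumed values): one has a single component, the other three.
state_a = {
    "weights": [1.0],
    "means": [[0.0, 0.0]],
    "variances": [[1.0, 1.0]],
}
state_b = {
    "weights": [0.5, 0.3, 0.2],
    "means": [[0.0, 0.0], [2.0, 2.0], [-2.0, 1.0]],
    "variances": [[1.0, 1.0], [0.5, 0.5], [1.0, 2.0]],
}

frame = [0.0, 0.0]  # one 2-dimensional feature vector
loglik_a = gmm_state_loglik(frame, **state_a)
loglik_b = gmm_state_loglik(frame, **state_b)
```

Because the per-state parameter lists are independent, nothing in the likelihood computation requires the states to share a component count. How many components each state should receive is the question the abstract addresses, and it is not modeled here.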
In conventional hidden Markov model (HMM) based speech recognisers, the emitting HMM states are mode...
Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a po...
The hypothesis that for a given amount of training data a speaker model has an optimum number of com...
We study a category of robust speech recognition problems in which mismatches exist between training ...
HMM-based systems for Automatic Speech Recognition typically model the acoustic features using mixt...
In an HMM based large vocabulary continuous speech recognition system, the evaluation of - context d...
In this thesis, we propose to use techniques based on factor analysis to build acoustic models for a...
This article provides a unifying Bayesian view on various approaches for acoustic model adaptation, ...
Models dealing directly with the raw acoustic speech signal are an alternative to conventional featu...
In most state-of-the-art speech recognition systems, Gaussian mixture models (GMMs) are used to ...
Automatic Speech Recognition (ASR) is affected by many variabilities present in the speech signal. D...
Automatic speech recognition (ASR) depends critically on building acoustic models for linguistic uni...