Minimum Classification Error (MCE) training is difficult to apply to language modeling because training data (N-best lists) is inherently scarce. A whole-sentence exponential language model is nevertheless well suited to MCE training, because it can capture global sentential phenomena with a relatively small number of powerful features. We review the model, discuss feature induction, find features in both the Broadcast News and Switchboard domains, and build an MCE-trained model for the latter. Our experiments show that even models with relatively few features are prone to overfitting and are sensitive to initial parameter settings, leading us to examine alternative weight optimization criteria and search algorithms.
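To make the training criterion concrete, the following is a minimal sketch of a sigmoid-smoothed MCE loss computed over one N-best list, with each hypothesis scored by a whole-sentence exponential model. It assumes the standard textbook MCE formulation (a soft maximum over competitors followed by a sigmoid), which may differ from the exact criterion used in the paper. The names `sentence_score`, `mce_loss`, `eta`, and `gamma` are hypothetical, and the normalization constant of the exponential model is omitted because it is the same for every hypothesis and cancels in the comparison.

```python
import math


def sentence_score(base_logprob, features, weights):
    """Unnormalized log score of a sentence under a whole-sentence
    exponential model: log p0(s) + sum_i lambda_i * f_i(s).

    base_logprob: log probability from a baseline model (assumed given).
    features:     sentence-level feature values f_i(s).
    weights:      feature weights lambda_i.
    """
    return base_logprob + sum(w * f for w, f in zip(weights, features))


def mce_loss(correct_score, competitor_scores, eta=1.0, gamma=1.0):
    """Sigmoid-smoothed MCE loss for one N-best list (illustrative only).

    The misclassification measure d is the soft maximum of the competitor
    scores minus the score of the correct hypothesis; d > 0 means the
    correct hypothesis is losing. The loss is sigmoid(gamma * d).
    """
    # Shift by the maximum before exponentiating, for numerical stability.
    m = max(competitor_scores)
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitor_scores)
    mean_exp /= len(competitor_scores)
    soft_max = m + math.log(mean_exp) / eta
    d = soft_max - correct_score
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

A loss near 0 means the correct hypothesis outscores its competitors, a loss near 1 means it is outscored, and exactly 0.5 means a tie. Because the loss is a smooth function of the feature weights, it can be minimized by gradient-based search, although the abstract notes that such optimization is sensitive to the initial parameter setting.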
The model training algorithm is a critical component in the statistical pattern recognition approach...
In this paper, we present novel techniques for performing topic adaptation on an n-gram language mod...
An automatic speech recognition system has two main components, the front end feature processing com...
We introduce an exponential language model which models a whole sentence or utterance as a single u...
The Minimum Classification Error (MCE) criterion is a well-known criterion in pattern classification...
The minimum classification error (MCE) framework for discriminative training is a simple an...
The Minimum Phone Error (MPE) criterion for discriminative training was shown to be able t...
For languages with fast vocabulary growth and limited resources, data sparsity leads to challenges i...
This paper considers discriminative training of language models for large vocabulary continuous spee...
Shigeru Katagiri and various co-authors have (re)introduced a nonstandard error measure which can b...
During minimum-classification-error (MCE) training, competing hypotheses against the correct one are...