We introduce factored language models (FLMs) and generalized parallel backoff (GPB). An FLM represents words as bundles of features (e.g., morphological classes, stems, data-driven clusters), and induces a probability model covering sequences of bundles rather than just words. GPB extends standard backoff to general conditional probability tables where variables might be heterogeneous types, where no obvious natural (temporal) backoff order exists, and where multiple dynamic backoff strategies are allowed. These methodologies were implemented during the JHU 2002 workshop as extensions to the SRI language modeling toolkit. This paper provides initial perplexity results on both CallHome Arabic and on Penn Treebank Wall Street Journal articles.
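A minimal sketch of the idea, assuming a toy corpus of (word, stem, morphological class) bundles. The data, feature names, helper names (`gpb_prob`), and the max-combination rule are illustrative assumptions, not the paper's or SRILM's actual implementation; the point is only that when the full multi-feature context is rare, the model backs off to several parent features in parallel rather than in a single fixed order.

```python
from collections import Counter

# Toy corpus of feature bundles (word, stem, morphological class).
# These bundles and feature names are made up for illustration.
tokens = [
    ("the", "the", "DET"), ("cats", "cat", "NPL"), ("walk", "walk", "VPL"),
    ("the", "the", "DET"), ("cat", "cat", "NSG"), ("walks", "walk", "V3SG"),
]

# Bigram counts of the current word given parent features of the previous
# bundle: the full (stem, class) context and each single feature alone.
ctx = {"stem+cls": Counter(), "stem": Counter(), "cls": Counter()}
parent = {"stem+cls": Counter(), "stem": Counter(), "cls": Counter()}
for (_, pstem, pcls), (w, _, _) in zip(tokens, tokens[1:]):
    for name, p in (("stem+cls", (pstem, pcls)), ("stem", pstem), ("cls", pcls)):
        ctx[name][(p, w)] += 1
        parent[name][p] += 1

def gpb_prob(w, pstem, pcls, threshold=1):
    """Estimate P(w | previous stem, previous class), FLM-style.

    Use the full two-feature context when it is observed often enough;
    otherwise back off to BOTH single-feature contexts in parallel and
    combine their estimates (here by taking the maximum). A real
    implementation would discount and renormalize; this sketch does not.
    """
    full = (pstem, pcls)
    if ctx["stem+cls"][(full, w)] > threshold:
        return ctx["stem+cls"][(full, w)] / parent["stem+cls"][full]
    backoffs = []
    for name, p in (("stem", pstem), ("cls", pcls)):
        if parent[name][p] > 0:
            backoffs.append(ctx[name][(p, w)] / parent[name][p])
    return max(backoffs, default=0.0)

# The full context ("cat", "NSG") is rare here, so both single-feature
# backoffs are consulted and the larger estimate is returned.
print(gpb_prob("walks", "cat", "NSG"))
```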
Virtually any modern speech recognition system relies on count-based language models. In this thesis...
Language models are probability distributions over a set of unilingual natural language text used in...
Mixture of Experts layers (MoEs) enable efficient scaling of language models through conditional com...
A language model combining word-based and category-based n-grams within a backoff framework is presen...
When a trigram backoff language model is created from a large body of text, trigrams and bigrams tha...
This paper introduces lattice-based language models, a new language modeling paradigm. These models ...
This paper reports on the benefits of large-scale statistical language modeling in machine translatio...
This paper describes two techniques for reducing the size of statistical back-off N-gram language mode...
In this paper, an extension of n-grams is proposed. In this extension, the memory of the model (n) i...
In this paper, an efficient method for language model look-ahead probability generation is presented...
Though statistical language modeling plays an important role in speech recognition, there are st...
N-gram language models are an essential component in statistical natural language processing systems...
In this paper we investigate different n-gram language models that are defined over an open...
To capture local and global constraints in a language, statistical n-grams are used in combination ...
A language model (LM) is a probability distribution over all possible word sequences. It is a vital ...
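As a worked equation for this definition (standard notation, added for clarity rather than drawn from any one abstract above): a language model factors the joint probability of a word sequence by the chain rule, and an n-gram model approximates each history by the previous n-1 words.

```latex
P(w_1,\dots,w_T) \;=\; \prod_{t=1}^{T} P(w_t \mid w_1,\dots,w_{t-1})
\;\approx\; \prod_{t=1}^{T} P(w_t \mid w_{t-n+1},\dots,w_{t-1})
```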