Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events which postdate it. In these pages we introduce an online paradigm which interleaves the estimation and application of a language model. We present a Bayesian approach to online language modelling, in which the marginal probabilities of a static trigram model are dynamically updated to match the topic being dictated to the system. We also describe the architecture of a prototype we have implemented which uses the...
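The idea in this abstract — nudging a static trigram model's probabilities toward the topic currently being dictated — can be illustrated with a generic cache-interpolation sketch. This is not the paper's actual Bayesian update; the class name, the `lam` weight, and the stand-in probability dictionary are all hypothetical, chosen only to show the static-plus-dynamic interpolation pattern:

```python
from collections import Counter

class CacheAdaptedTrigram:
    """Minimal sketch: interpolate a static trigram model with a
    dynamic unigram cache built from recently dictated words.
    The static model here is a stand-in dict; a real system would
    use a smoothed trigram model trained on a large corpus."""

    def __init__(self, static_trigram_prob, vocab, lam=0.2):
        self.static = static_trigram_prob   # dict: (w1, w2, w3) -> prob
        self.vocab = vocab
        self.lam = lam                      # weight given to the dynamic cache
        self.cache = Counter()              # counts of recently observed words

    def observe(self, word):
        # Update the dynamic cache as dictation proceeds.
        self.cache[word] += 1

    def prob(self, w1, w2, w3):
        # Back off to a uniform estimate for unseen trigrams.
        p_static = self.static.get((w1, w2, w3), 1.0 / len(self.vocab))
        total = sum(self.cache.values())
        p_cache = self.cache[w3] / total if total else 0.0
        return (1 - self.lam) * p_static + self.lam * p_cache
```

As the user dictates on-topic words, the cache term raises the probability of words already seen in the session, which is the qualitative effect the abstract describes.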
We propose a probabilistic language model that is intended to overcome some of the limitations of th...
We propose a novel method for using the World Wide Web to acquire trigram estimates for statistical...
Training a language model from conversational speech is difficult due to the large variation of the w...
Usually, language models are built either from a closed corpus, or by using World Wide Web retrieved...
We present a new approach for language modeling based on dynamic Bayesian netw...
In a previous paper we proposed Web-based language models relying on possibility theory. These m...
We are interested in the problem of learning stochastic language models on-line (without speech tran...
The use of language is one of the defining features of human cognition. Focusing here on two key fea...
In this paper we investigate whether more accurate modeling of differences in language in different ...
Building models of language is a central task in natural language processing. Traditionally, languag...
This thesis contributes to the research domain of statistical language modeling. In this domain, the...
In (Ward and Vega 2008) we examined how word probabilities vary with time into utterance, and pr...
Most previous work on trainable language generation has focused on two paradigms: (a) using a statis...
Adaptor grammars are a flexible, powerful formalism for defining nonparametric, unsupervised models...
In recent years there has been an increased interest in domain adaptation techniques for statistical...