In this paper, we investigate language models based on a stochastic context-free grammar (SCFG), a bigram model, and a quasi-trigram model. To estimate the bigram and quasi-trigram statistics, we used a set of sentences generated randomly from a CFG, keeping only those that are semantically legal. We compared the models on perplexity and on sentence recognition accuracy. Sentence recognition was tested on the "UNIX-QA" task with a vocabulary of 521 words. From these results, the perplexities of the bigram and quasi-trigram models were about 1.6 times and 1.3 times larger, respectively, than the perplexity of the CFG, which corresponds to the most restrictive grammar (perplexity = 10.0), and the perplexity of the SCFG was only about half that of the CFG. We realized ...
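To make the perplexity figures above concrete, the following is a minimal sketch of how a bigram model's perplexity is computed on held-out sentences. It uses maximum-likelihood counts with add-one smoothing on a toy corpus; the corpus, smoothing choice, and function name are illustrative assumptions, not the paper's actual setup.

```python
import math
from collections import Counter

def bigram_perplexity(train, test):
    """Perplexity of an add-one-smoothed bigram model.

    `train` and `test` are lists of token lists. Sentences are padded
    with <s>/</s> markers. Toy illustration only, not the paper's system.
    """
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in train:
        toks = ["<s>"] + sent + ["</s>"]
        vocab.update(toks)
        unigrams.update(toks[:-1])          # history counts
        bigrams.update(zip(toks, toks[1:]))  # pair counts
    V = len(vocab)

    log_prob, n_tokens = 0.0, 0
    for sent in test:
        toks = ["<s>"] + sent + ["</s>"]
        for w1, w2 in zip(toks, toks[1:]):
            # add-one (Laplace) smoothed conditional P(w2 | w1)
            p = (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)
            log_prob += math.log2(p)
            n_tokens += 1
    # perplexity = 2 ** (average negative log2 probability per token)
    return 2 ** (-log_prob / n_tokens)

train = [["show", "files"], ["show", "processes"]]
print(bigram_perplexity(train, [["show", "files"]]))
```

A lower perplexity means the model constrains the recognizer's search more tightly, which is why the CFG (the most restrictive model here) serves as the reference point for the ratios reported above.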
In a series of preparatory experiments in 4 languages on subsets of the Europa...
Conventional n-gram language models are well-established as powerful yet simple mechanisms for chara...
Statistical language models (SLMs) for speech recognition have the advantage of robustness, and gram...
This PhD thesis studies the overall effect of statistical language modeling on perplexity and word e...
In this paper several methods are proposed for reducing the size of a trigram language model (LM), w...
In this paper, an extension of n-grams is proposed. In this extension, the memory of the model (n) i...
Previous attempts to automatically determine multi-words as the basic unit for language modeling hav...
Introduction At the current state of the art, high-accuracy speech recognition with moderate to lar...
It seems obvious that a successful model of natural language would incorporate a great deal of both ...
This technical report presents a probabilistic model of English grammar that is based upon "gr...
Statistical language models are widely used in automatic speech recognition in order to constrain th...
Abstract Unsupervised learning algorithms have been derived for several statistical models of Englis...
The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the constr...
This paper appears in Proceedings of the Third International Workshop on Parsing Technologies, 1993.
In this paper w...