N-gram language models are an essential component in statistical natural language processing systems for tasks such as machine translation, speech recognition, and optical character recognition. They are also responsible for much of the computational cost. This thesis contributes efficient algorithms for three language modeling problems: estimating probabilities from corpora, representing a model in memory, and searching for high-scoring output when log language model probability is part of the score. Most existing language modeling toolkits operate in RAM, effectively limiting model size. This work contributes disk-based streaming algorithms that use a configurable amount of RAM to estimate Kneser-Ney language models 7.13 times as fast a...
The quality of translations produced by statistical machine translation (SMT) systems crucially depe...
Language models are probability distributions over a set of unilingual natural language text used in...
Statistical machine translation, the task of translating text from one natural language into another...
Statistical machine translation, as well as other areas of human language processing, has recentl...
Research in speech recognition and machine translation is boosting the use of large-scale n-gram lan...
This paper reports on the benefits of large-scale statistical language modeling in machine translatio...
Machine translation is the discipline concerned with developing automated tools for translating fro...
This paper deals with the two fundamental problems concerning the handling of large n-gram language ...
2014-07-28. The goal of machine translation is to translate from one natural language into another usi...
In recent years neural language models (LMs) have set state-of-the-art performance for several bench...
Approximate search algorithms, such as cube pruning in syntactic machine translation, rely on the la...
We contribute 5-gram counts and language models trained on the Common Crawl corpus, a collection ove...
Simple and Efficient Model Filtering in Statistical Machine Translation. Data availability and distri...
Statistical n-gram language modeling is used in many domains like speech recognition, language ident...