In the context of confusible disambiguation (spelling correction that requires context), the synchronous back-off strategy combined with traditional n-gram language models performs well. However, when the alternatives consist of different numbers of tokens, this classification technique cannot be applied directly, because the computation of the probabilities is skewed. Previous work has already shown that probabilities based on n-grams of different orders should not be compared directly. In this article, we propose new probability metrics in which the size of n is varied according to the number of tokens in the confusible alternative. This requires access to n-grams of variable length. Results show that the synchronous back-off method is extremely...
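As a rough illustration of the idea summarised above (matching the n-gram order to the token length of each confusible alternative, so that every alternative is scored over the same amount of surrounding context), the following Python sketch is one possible reading; the toy corpus, the add-one smoothing, and the simple normalisation are assumptions for illustration, not the metrics proposed in the article.

    from collections import Counter

    def ngram_counts(corpus_tokens, n):
        # Count every n-gram of a fixed order n in a tokenised corpus.
        return Counter(tuple(corpus_tokens[i:i + n])
                       for i in range(len(corpus_tokens) - n + 1))

    def score_alternative(corpus_tokens, left, alt, right, context=1):
        # Score one alternative on a span of `context` tokens of left and right
        # context plus the alternative itself. Because `alt` may span one or
        # several tokens, the n-gram order is adapted to len(alt): every
        # alternative is matched against the same amount of surrounding context
        # rather than the same fixed n.
        span = tuple(left[-context:] + alt + right[:context])
        counts = ngram_counts(corpus_tokens, len(span))
        total = sum(counts.values()) or 1
        return (counts[span] + 1) / (total + 1)   # naive add-one smoothing (assumption)

    # Hypothetical usage: choose between a one-token and a two-token alternative.
    corpus = "a lot of people think a lot about these things".split()
    left, right = ["think"], ["about"]
    alternatives = [["alot"], ["a", "lot"]]
    print(max(alternatives, key=lambda a: score_alternative(corpus, left, a, right)))

Adapting the span length rather than the raw n keeps the competing probabilities defined over comparably sized events, which is the skew the abstract attributes to fixed-order models.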
Compounding is one of the most productive word formation processes in many languages an...
Conventional n-gram language models are well-established as powerful yet simple mechanisms for chara...
How cross-linguistically applicable are NLP models, specifically language models? A fair comparison ...
In this article, we propose the use of suffix arrays to implement n-gram language models with practic...
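The entry is truncated here, but as a hedged sketch of the general technique it names, a suffix array over the training tokens lets a language model look up counts of arbitrary-length n-grams by binary search instead of storing a separate table per order. The naive construction and toy data below are illustrative assumptions, not the article's implementation (the key= argument of bisect requires Python 3.10+).

    from bisect import bisect_left, bisect_right

    def build_suffix_array(tokens):
        # Naive construction: sort the starting positions of all suffixes.
        # (Real implementations use O(n log n) or linear-time algorithms.)
        return sorted(range(len(tokens)), key=lambda i: tokens[i:])

    def ngram_count(tokens, sa, ngram):
        # Every occurrence of the n-gram is a prefix of some suffix, and those
        # suffixes form a contiguous block in the suffix array, so two binary
        # searches give the count for any n without a precomputed n-gram table.
        n = len(ngram)
        key = lambda i: tokens[i:i + n]
        return (bisect_right(sa, list(ngram), key=key)
                - bisect_left(sa, list(ngram), key=key))

    tokens = "the cat sat on the mat and the cat ran".split()
    sa = build_suffix_array(tokens)
    print(ngram_count(tokens, sa, ("the", "cat")))   # -> 2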
The problem of identifying and correcting confusibles, i.e. context-sensitive spelling errors, in te...
In this paper, an extension of n-grams is proposed. In this extension, the memory of the model (n) i...
Language models typically tokenize text into subwords, using a deterministic, hand-engineered heuris...
Verwimp L., Pelemans J., Van hamme H., Wambacq P., "Extending n-gram language models based on equiv...
In this paper, an extension of n-grams, called x-grams, is proposed. In this extension, the memory o...
The proper detection of tokens in running text represents the initial processing step in modular ...
Building models of language is a central task in natural language processing. Traditionally, languag...
The work presents the task of spelling correction realized in a batch mode with support...
Natural Language is highly ambiguous, on every level. This article describes a fast broad-coverage s...