Suffix trees are data structures that can be used to index a corpus. In this paper, we explore how certain properties of suffix trees naturally provide the functionality of an n-gram language model with variable n. We describe these properties, which we leverage in our Suffix Tree Language Model (STLM) implementation, and explain how a suffix tree implicitly contains the data needed for n-gram language modeling. We also discuss the kinds of smoothing techniques appropriate to such a model. We then show that our STLM implementation is competitive with the state-of-the-art language modeling toolkit SRILM (Stolcke, 2002) in statistical machine translation experiments.
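To make concrete the claim that a suffix tree implicitly holds n-gram data, here is a minimal sketch. It is not the STLM implementation. It assumes a word-level, uncompressed suffix trie (a real suffix tree would compress unary paths and can be built in linear time), and the names `Node`, `build_suffix_trie`, `ngram_count`, and `mle_prob` are hypothetical. Each node stores how many suffixes pass through it, which equals the corpus count of the n-gram spelled out on the path from the root, so n-grams of any length can be looked up in the same structure. The probability shown is an unsmoothed maximum-likelihood estimate.

```python
class Node:
    """Trie node: `count` is the number of corpus suffixes passing through it."""
    def __init__(self):
        self.count = 0
        self.children = {}


def build_suffix_trie(tokens, max_depth=None):
    """Insert every suffix of `tokens` (optionally truncated to max_depth tokens)."""
    root = Node()
    n = len(tokens)
    for i in range(n):
        root.count += 1  # one suffix starts at each position
        node = root
        end = n if max_depth is None else min(n, i + max_depth)
        for tok in tokens[i:end]:
            node = node.children.setdefault(tok, Node())
            node.count += 1
    return root


def find(root, ngram):
    """Return the node reached by following `ngram` from the root, or None."""
    node = root
    for tok in ngram:
        node = node.children.get(tok)
        if node is None:
            return None
    return node


def ngram_count(root, ngram):
    """Corpus frequency of `ngram`, for any n (0 if unseen)."""
    node = find(root, ngram)
    return node.count if node is not None else 0


def mle_prob(root, context, word):
    """Unsmoothed estimate P(word | context) = c(context + word) / sum_w c(context + w)."""
    node = find(root, context)
    if node is None:
        return 0.0
    denom = sum(child.count for child in node.children.values())
    if denom == 0:
        return 0.0
    child = node.children.get(word)
    return (child.count / denom) if child is not None else 0.0


corpus = "the cat sat on the mat the cat ran".split()
trie = build_suffix_trie(corpus)
print(ngram_count(trie, ["the", "cat"]))  # bigram count
print(mle_prob(trie, ["the"], "cat"))     # P(cat | the)
```

Without `max_depth`, this uncompressed trie needs space quadratic in the corpus length, so the sketch is only suitable for illustration on small inputs. Unseen events receive probability zero here, which is the gap that smoothing is meant to fill.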
We introduce a data structure, the Bundled Suffix Tree (BUST), that is a generalization of a Suffix ...
The sparse suffix trees are suffix trees in which some suffixes are omitted. T...
Kennington C, Kay M, Friedrich A. Suffix Trees as Language Models. In: Calzolari (Conference Chair) ...
With more and more natural language text stored in databases, handling respective query predicates b...
In the present work we study suffix tree construction algorithms. This structure helps solving a var...
In recent years highly compact succinct text indexes developed in bioinformatics have spread to the ...
With more and more text data stored in databases, the problem of handling natural language query pre...
Efficient methods for storing and querying are critical for scaling high-order m-gram language model...
Suffix trees and suffix arrays are fundamental full-text index data structures to solve problems oc...
An approach that incorporates WordNet features to an n-gram language modeler has been developed in t...
Abstract. Suffix trees are the key data structure for text string matching, and are used in wide app...