Accessing and processing textual information (i.e., storing and querying a set of strings) is important for many current applications (e.g., information retrieval and social networks), particularly in Big Data and IoT settings, which require handling very large string dictionaries. Typical data structures for textual indexing are hash tables and some variants of tries, such as the Double Trie (DT). In this paper, we propose an extension of the DT that we call MergedTrie. It improves on the compression of the DT by merging both tries into a single one and by segmenting each indexed term into two fixed-length parts in order to balance the new trie. Thus, a higher overlapping of both prefixes and suffixes is...
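The idea of storing both parts of a term in one shared trie can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: the class name `MergedTrieSketch`, the midpoint split, the reversal of the second part, and the per-node `links` set are all assumptions made for the example, and no compression is attempted.

```python
class Node:
    """Trie node; `links` records which nodes complete a term starting here."""
    __slots__ = ("children", "links")

    def __init__(self):
        self.children = {}   # char -> Node
        self.links = set()   # nodes reached by the (reversed) second part


class MergedTrieSketch:
    """Hypothetical sketch: one trie holds first parts and reversed second parts."""

    def __init__(self):
        self.root = Node()

    @staticmethod
    def _split(term):
        # Assumption: split at the midpoint; the second part is reversed
        # so that terms with a common ending share a path from the root.
        mid = (len(term) + 1) // 2
        return term[:mid], term[mid:][::-1]

    def _walk(self, s, create):
        # Follow (and optionally create) the path spelling `s` from the root.
        node = self.root
        for ch in s:
            nxt = node.children.get(ch)
            if nxt is None:
                if not create:
                    return None
                nxt = node.children[ch] = Node()
            node = nxt
        return node

    def insert(self, term):
        head, tail = self._split(term)
        head_node = self._walk(head, create=True)
        tail_node = self._walk(tail, create=True)
        # A term is the pair (end of first part, end of reversed second part).
        head_node.links.add(tail_node)

    def __contains__(self, term):
        head, tail = self._split(term)
        head_node = self._walk(head, create=False)
        tail_node = self._walk(tail, create=False)
        return (head_node is not None and tail_node is not None
                and tail_node in head_node.links)


trie = MergedTrieSketch()
for word in ("national", "rational"):
    trie.insert(word)
```

In this sketch, "national" and "rational" share the single path that spells their common ending, and a lookup succeeds only when the pair of end nodes was linked by an earlier insertion.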
This paper is about compressed full-text indexes. That is, our goal is to represent full-te...
The rise of repetitive datasets has lately generated a lot of interest in compressed self-indexes ba...
This paper deals with the two fundamental problems concerning the handling of large n-gram language ...
This thesis deals with space-efficient algorithms to compress and index texts. The aim of compressio...
The sheer increase in volume of RDF data demands efficient solutions for the triple indexing problem...
This thesis presents three trie organizations for various binary tries. The new trie structures have...
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has th...
The need to store and query a set of strings - a string dictionary - aris...
Let a text of u characters over an alphabet of size σ be compressible to n phrases by the LZ...
In this thesis, we will illustrate a two-level approach to compress and index string dictionaries, w...
Two fundamental problems concern the handling of large n-gram language models: indexing, that is, co...
We propose algorithms that, given the input string of length n over integer alphabet of size σ, cons...
This thesis focuses on the design of succinct and compressed data structures for collections of stri...