This thesis presents a two-level approach to compressing and indexing string dictionaries, a crucial component of many software platforms for big data applications. The approach, which combines a succinctly encoded Patricia Trie with data compression techniques, aims to be simple yet efficient in both space occupancy and query time, on par with state-of-the-art solutions. Its performance is evaluated experimentally on several datasets against two state-of-the-art solutions: the Fast Succinct Trie and the Path Decomposed Trie. We show that, despite its simplicity, the proposed solution achieves space and time performance on the Pareto frontier of the best approaches, ...
As no database exists without indexes, no index implementation exists without or...
Current data structures for searching large string collections are limited in that they either fail...
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has th...
The need to store and query a set of strings - a string dictionary - arises in many kinds of applica...
The need to store and query a set of strings – a string dictionary – arises in many kinds of applica...
ISI publication article. The need to store and query a set of strings - a string dictionary - aris...
We introduce a new family of compressed data structures to efficiently store and query la...
Abstract. We present a technique to build an index based on suffix arrays for compressed texts. We also...
The accessing and processing of textual information (i.e. the storing and querying of a set of strin...
String dictionaries constitute a large portion of the memory footprint of database applications. Whi...
Current data structures for searching large string collections either fail to achieve minimum space ...
An indexed sequence of strings is a data structure for storing a string sequence that supports rando...
Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk ...
This doctoral dissertation presents a range of results concerning efficient algorithms and data stru...
Tries are popular data structures for storing a set of strings, where common prefixes are represente...