[Abstract] We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix computation in suffix arrays. Our data structures yield relevant space-time tradeoffs in real-world dictionaries. We focus on two domains where string dictionaries are extensively used and efficient compression is required: URL collections, a key element in Web graphs and applications such as Web mining; and collections of URIs and literals, the basic components of RDF datasets. Our experiments show that our data structures achieve better compression than the state-of-the-art alternatives while providi...
Relative Lempel-Ziv (RLZ) parsing is a dictionary compression method in which a string S is compress...
In this paper, we present an experimental study of the space-time tradeoffs for the dict...
Recent research in compressing suffix arrays has resulted in two breakthrough indexing d...
ISI publication article. The need to store and query a set of strings - a string dictionary - aris...
The need to store and query a set of strings – a string dictionary – arises in many kinds of applica...
In this thesis, we will illustrate a two-level approach to compress and index string dictionaries, w...
String dictionaries constitute a large portion of the memory footprint of database applications. Whi...
Domain encoding is a common technique to compress the columns of a column store and to accelerate ma...
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has th...
Web crawls generate vast quantities of text, retained and archived by the search services that initi...
We design two compressed data structures for the full-text indexing problem that support efficient s...
We propose measures for compressed data structures, in which space usage is measured in ...
In this paper, we propose measures for compressed data structures, in which space usage is m...
This doctoral dissertation presents a range of results concerning efficient algorithms and data stru...