Current data structures for searching large string collections either fail to achieve minimum space or cause too many cache misses. In this paper we discuss some edge linearizations of the classic trie data structure that are simultaneously cache-friendly and compressed. We provide new insights on front coding [24], introduce other novel linearizations, and study how close their space occupancy is to the information-theoretic minimum. The moral is that they are not just heuristics. Our second contribution is a novel dictionary encoding scheme that builds upon such linearizations and achieves nearly optimal space, offers competitive I/O-search time, and is also conscious of the query distribution. Finally, we combine those data structures wi...
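As an illustration of the front coding mentioned above, the following is a minimal sketch of the textbook scheme, not the linearizations or the dictionary encoding proposed in the paper. It assumes the strings are already sorted lexicographically and stores each one as the length of the prefix it shares with its predecessor plus the remaining suffix. The names `lcp`, `front_encode`, and `front_decode` are hypothetical helpers introduced only for this example.

```python
def lcp(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def front_encode(sorted_strings):
    """Front-code a lexicographically sorted list of strings.

    Each string becomes a pair (k, suffix): k is the number of leading
    characters shared with the previous string, suffix is the rest.
    """
    encoded = []
    prev = ""
    for s in sorted_strings:
        k = lcp(prev, s)
        encoded.append((k, s[k:]))
        prev = s
    return encoded


def front_decode(encoded):
    """Invert front_encode by rebuilding each string from its predecessor."""
    decoded = []
    prev = ""
    for k, suffix in encoded:
        s = prev[:k] + suffix
        decoded.append(s)
        prev = s
    return decoded


words = ["trie", "tried", "tries", "trip"]
codes = front_encode(words)
```

Decoding is sequential, since each string depends on the one before it. Practical variants therefore usually store some strings uncompressed at regular intervals, so that a search can start decoding from a nearby restart point rather than from the beginning.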
In this thesis, we will illustrate a two-level approach to compress and index string dictionaries, w...
We propose measures for compressed data structures, in which space usage is measured in a data-awa...
In this paper, we present cache-efficient algorithms for trie search. There are three key features o...
Current data structures for searching large string collections are limited in that they either fail...
In this article, we study three variants of the well-known prefix-search problem for strings, and we...
Tries are popular data structures for storing a set of strings, where common prefixes are represente...
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees p...
The need to store and query a set of strings – a string dictionary – arises in many kinds of applica...
The need to store and query a set of strings – a string dictionary – arises in many kinds of applica...
This thesis revisits two fundamental problems in data structure design: predecessor search and rank/...
In this paper, we propose measures for compressed data structures, in which space usage is m...
We propose measures for compressed data structures, in which space usage is measured in a data-aware...
The past few years have witnessed several exciting results on compressed representation ...