We design a succinct full-text index based on the idea of Huffman-compressing the text and then applying the Burrows-Wheeler transform over it. The resulting structure can be searched as an FM-index, with the benefit of removing the sharp dependence on the alphabet size, σ, present in that structure. On a text of length n with zero-order entropy H0, our index needs O(n(H0+1)) bits of space, without any dependence on σ. The average search time for a pattern of length m is O(m(H0 + 1)), under reasonable assumptions. Each position of a text occurrence can be reported in worst case time O((H0 + 1) log n), while any text substring of length L can be retrieved in O((H0 + 1)L) average time in addition to the previous worst case time. By pay...
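The n(H0 + 1) term in the space bound reflects a standard property of Huffman codes: the encoded text takes at least n·H0 bits and fewer than n(H0 + 1) bits. The following is a minimal sketch of that property only, not the index from the abstract; the function names are hypothetical and chosen for illustration.

```python
import heapq
import math
from collections import Counter


def zero_order_entropy(text):
    """Zero-order empirical entropy H0 of text, in bits per symbol."""
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built on text."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth}); the unique
    # tiebreak keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_encoded_bits(text):
    """Total number of bits in the Huffman encoding of text."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[s] for s in text)
```

For any non-empty text, `len(text) * zero_order_entropy(text) <= huffman_encoded_bits(text) < len(text) * (zero_order_entropy(text) + 1)` should hold, which is the quantity the O(n(H0 + 1)) bound is built on.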
The size of electronic data is currently growing at a faster rate than computer memory and disk stor...
Recent research in compressing suffix arrays has resulted in two breakthrough indexing d...
The LZ-index is a compressed full-text self-index able to represent a text T1...u, over an alphabet...
We design a succinct full-text index based on the idea of Huffman-compressing the text and then app...
We design a succinct full-text index based on the idea of Huffman-compressing the text and then appl...
The FM-index is a succinct text index needing only O(Hk n) bits of space, where n is the text size an...
In an earlier work [6] we presented a simple FM-index variant, based on the idea of Huffma...
We show that, by combining an existing compression boosting technique with the wavelet tree data str...
We design two compressed data structures for the full-text indexing problem that support efficient s...
In this paper we design two compressed data structures for the full-text indexing problem. These da...
Given a string X of length n on alphabet Σ, the FM-index data structure allows counting all occurrenc...
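Counting with an FM-index is usually done by backward search over the Burrows-Wheeler transform. The sketch below is a naive illustration under stated assumptions: it sorts suffixes directly and computes ranks by scanning, so it shows the logic but none of the succinct space or time bounds; the function names are hypothetical, and the terminator `"\0"` is assumed not to occur in the text.

```python
def build_bwt_index(text):
    """Build a naive BWT and C array for text (illustrative, not succinct)."""
    s = text + "\0"  # assumed terminator, smaller than every text symbol
    # Naive suffix array: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    # BWT character of a row is the symbol preceding that suffix (cyclically).
    bwt = "".join(s[i - 1] for i in sa)
    # C[c] = number of symbols in s that are strictly smaller than c.
    C, total = {}, 0
    for c in sorted(set(s)):
        C[c] = total
        total += s.count(c)
    return bwt, C


def count_occurrences(bwt, C, pattern):
    """Backward search: count occurrences of a non-empty pattern."""
    lo, hi = 0, len(bwt)  # half-open range of suffix-array rows
    for c in reversed(pattern):
        if c not in C:
            return 0
        # bwt.count(c, 0, k) plays the role of rank(c, k) on the BWT.
        lo = C[c] + bwt.count(c, 0, lo)
        hi = C[c] + bwt.count(c, 0, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

A real FM-index replaces the linear-time `bwt.count` calls with constant-time or near-constant-time rank structures; that substitution is where the space and time bounds discussed in these abstracts come from.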
Indexing highly repetitive texts - such as genomic databases, software repositories and versioned te...
A new trend in the field of pattern matching is to design indexing data structures which...
This paper is about compressed full-text indexes. That is, our goal is to represent full-te...
A succinct full-text self-index is a data structure built on a text T = t1t2...tn, which takes littl...