First Huffman, then Burrows-Wheeler: A Simple Alphabet-Independent FM-Index

Navarro, Gonzalo
Grabowski, Szymon
Mäkinen, Veli

Publication date

January 2014

Abstract

We design a succinct full-text index based on the idea of Huffman-compressing the text and then applying the Burrows-Wheeler transform over it. The resulting structure can be searched as an FM-index, with the benefit of removing the sharp dependence on the alphabet size, σ, present in that structure. On a text of length n with zero-order entropy H0, our index needs O(n(H0 + 1)) bits of space, without any dependence on σ. The average search time for a pattern of length m is O(m(H0 + 1)), under reasonable assumptions. Each position of a text occurrence can be reported in worst-case time O((H0 + 1) log n), while any text substring of length L can be retrieved in O((H0 + 1)L) average time in addition to the previous worst-case time. By pay...
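The pipeline named in the abstract (Huffman-compress the text, apply the Burrows-Wheeler transform to the resulting bit string, then search it as an FM-index) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `huffman_codes`, `build_index`, and `count_occurrences` are hypothetical, the suffix array is built by naive sorting and kept in full, and codeword alignment is checked against an explicit set of start positions, so none of the space or time bounds above apply to it. It only shows why backward search over a binary alphabet suffices once the text is Huffman-coded.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return {symbol: bit string} for a Huffman code built from symbol counts."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs a 1-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, unique tiebreak, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def build_index(text):
    """Huffman-encode `text`, then build BWT-based search tables over the bits."""
    codes = huffman_codes(text)
    bits = "".join(codes[ch] for ch in text)
    # Bit positions where a codeword begins (assumption of this sketch:
    # kept as a plain set instead of a succinct structure).
    starts, pos = set(), 0
    for ch in text:
        starts.add(pos)
        pos += len(codes[ch])
    s = bits + "$"  # '$' sorts before '0' and '1' in ASCII
    sa = sorted(range(len(s)), key=lambda i: s[i:])  # naive suffix array
    bwt = "".join(s[i - 1] for i in sa)  # i == 0 wraps around to '$'
    # C[c]: number of characters in s strictly smaller than c.
    C = {"0": 1, "1": 1 + bits.count("0")}
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in "01"}
    for i, b in enumerate(bwt):
        for c in "01":
            occ[c][i + 1] = occ[c][i] + (1 if b == c else 0)
    return {"codes": codes, "sa": sa, "starts": starts, "C": C, "occ": occ}


def count_occurrences(index, pattern):
    """Count occurrences of `pattern` via backward search on the binary BWT."""
    codes = index["codes"]
    if any(ch not in codes for ch in pattern):
        return 0
    pbits = "".join(codes[ch] for ch in pattern)
    C, occ, sa = index["C"], index["occ"], index["sa"]
    lo, hi = 0, len(sa)
    for c in reversed(pbits):
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    # A bit-level match is a real occurrence only if it begins at a codeword
    # boundary; the prefix-free code then guarantees the symbols match.
    return sum(1 for k in range(lo, hi) if sa[k] in index["starts"])


if __name__ == "__main__":
    idx = build_index("abracadabra")
    for p in ("abra", "a", "cad", "x"):
        print(p, count_occurrences(idx, p))
```

Each backward-search step consumes one bit of the encoded pattern, so the number of steps equals the length of the pattern's Huffman encoding rather than depending on the size of the original alphabet.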
