Searching for patterns in a DNA sequence is an important step in biological research. To speed up the search, one can index the sequence. However, classical indexing data structures such as suffix trees and suffix arrays are not feasible for DNA sequences because of their main-memory requirements, as DNA sequences can be very long. In this paper, we evaluate the performance of two compressed data structures, the Compressed Suffix Array (CSA) and the FM-index, for indexing and searching DNA sequences. Our results show that CSA is better than FM-index for searching long patterns. We also investigate other practical aspects of the data structures, such as the memory required to build the indexes.
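To make the kind of index being evaluated more concrete, the following is a minimal sketch of FM-index style backward search over a DNA string. It is an illustration only and not the implementation evaluated in the paper: the function names (`build_fm_index`, `locate`) are hypothetical, the suffix array is built naively, and the occurrence table and suffix array are stored in full rather than sampled or compressed, so the sketch shows the search logic but none of the space savings.

```python
def build_fm_index(text):
    """Build toy FM-index tables for `text` (hypothetical helper, uncompressed)."""
    text = text + "$"  # unique sentinel, sorts before A, C, G, T
    # Naive suffix array construction; only suitable for tiny inputs.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)

    # c_table[c] = number of characters in text strictly smaller than c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    c_table = {}
    total = 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[c][i] = number of occurrences of c in bwt[:i] (stored in full here).
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, c_table, occ


def locate(pattern, sa, c_table, occ):
    """Return sorted start positions of `pattern` using backward search."""
    lo, hi = 0, len(sa)  # half-open range of suffix array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, c_table, occ = build_fm_index("ACGTACGA")
print(locate("ACG", sa, c_table, occ))
```

Each pattern character costs two table lookups, so the search loop takes a number of steps proportional to the pattern length and independent of the sequence length. Practical FM-index and CSA implementations replace the full `occ` table and suffix array with sampled, compressed structures, which is where the memory and speed trade-offs studied in the paper arise.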
Recent advances in the asymptotic resource costs of pattern matching with compressed suffix arrays a...
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has th...
Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on...
Sequence data is one of the rapidly growing types of data. New efficient and scalable techniques are...
Recent research in compressing suffix arrays has resulted in two breakthrough indexing d...
With the first human DNA being decoded into a sequence of about 2.8 billion characters, much biologi...
In this paper, we develop a simple and practical storage scheme for compressed suffix arra...
Our aim is to develop new database technologies for the approximate matching of unstructur...
We investigate the problem of building full-text substring indexes for inputs significantly larger t...
The amount of available biological sequences, represented as strings over the DNA and protein alphab...
In order to facilitate and speed up the search of massive DNA databases, the database is indexed at ...
Self-indexes are largely studied and widely applied structures in string matching. However...