Online persistent suffix tree construction has been considered impractical due to its excessive I/O costs. However, these prior studies have not taken into account the effects of the buffer management policy and the internal node structure of the suffix tree on the I/O behavior of construction and subsequent retrievals over the tree. In this paper, we study these two issues in detail in the context of large genomic DNA and protein sequences. In particular, we make the following contributions: (i) a novel, low-overhead buffering policy called TOP-Q which improves the on-disk behavior of suffix tree construction and subsequent retrievals, and (ii) empirical evidence that the space-efficient linked-list representation of suffix tree nodes prov...
With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data...
We propose a new method to build persistent suffix trees for indexing the genomic data. Our algorith...
The suffix tree is a data structure for indexing strings. It is used in a variety of applications su...
Mammalian genomes are typically 3 Gbp (gigabase pairs) in size. The largest public database NCBI (Na...
The suffix tree is a well-known and popular indexing structure for various sequence processing probl...
A suffix tree is a fundamental data structure for string searching algorithms. Unfortunately, when ...
Sequence datasets are ubiquitous in modern life-science applications, and querying sequences is a co...
Suffix trees have been established as one of the most versatile index structures for unstr...
In this study, we present an online suffix tree construction approach where multiple sequences are i...
The construction of suffix trees for very long sequences is essential for many applications, and it p...
In recent years, bioinformatics has become an important research field because there are more and more ...
The suffix tree (or equivalently, the enhanced suffix array) provides efficient solutions to many pr...
Suffix-trees are popular indexing structures for various sequence processing problems in biological ...