Mammalian genomes are typically 3 Gbp (gigabase pairs) in size. The largest public DNA database, NCBI (National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov), contains more than 20 Gbp. Suffix trees are widely acknowledged as a data structure that efficiently supports exact and approximate sequence matching queries, as well as the discovery of repetitive structures, when they can reside in main memory. However, handling long DNA sequences with suffix trees has proven difficult because of the so-called memory bottleneck problem. The most space-efficient main-memory suffix tree construction algorithm takes nine hours and 45 GB of memory to index the human genome [19]. In this paper, we show that suffix trees for long DNA sequenc...
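To make the matching query and the memory pressure described above concrete, here is a minimal sketch of a naive, uncompressed suffix trie. It is not the construction algorithm of any paper listed here: the function names (`build_suffix_trie`, `contains`, `count_nodes`) are hypothetical, and the quadratic-time, quadratic-space build is chosen only because it is short. A real suffix tree compresses single-child paths and is built in linear time.

```python
def build_suffix_trie(text):
    """Build a naive suffix trie as nested dicts (illustrative only).

    Every suffix of text + "$" is inserted character by character, so the
    build costs O(n^2) time and, in the worst case, O(n^2) nodes.
    """
    text += "$"  # terminator, assumed not to occur in the input
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Exact-match query: walk one edge per pattern character."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_nodes(trie):
    """Count trie nodes iteratively, as a rough proxy for memory use."""
    total = 0
    stack = [trie]
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(node.values())
    return total


if __name__ == "__main__":
    trie = build_suffix_trie("GATTACA")
    print(contains(trie, "TTAC"))  # pattern occurs in the text
    print(contains(trie, "TAG"))   # pattern does not occur
    print(count_nodes(trie))
```

A query takes time proportional to the pattern length regardless of the text length, which is why the structure is attractive. The node count shows why memory is the bottleneck: even with compression, the figures quoted above (45 GB for a genome of roughly 3 Gbp) work out to about 15 bytes of index per base.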
We propose a new method to build persistent suffix trees for indexing genomic data. Our algorith...
Sequence data is one of the rapidly growing types of data. New efficient and scalable techniques are...
The suffix tree (or equivalently, the enhanced suffix array) provides efficient solutions to many pr...
A suffix tree is a fundamental data structure for string searching algorithms. Unfortunately, when ...
Sequence datasets are ubiquitous in modern life-science applications, and querying sequences is a co...
The construction of suffix trees for very long sequences is essential for many applications, and it p...
Online persistent suffix tree construction has been considered impractical due to its excessive I/O ...
Online persistent suffix tree construction has been considered impractical due to its excessive I/O...
With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data...
Abstract. Suffix trees have been established as one of the most versatile index structures for unstr...
The suffix tree is a well-known and popular indexing structure for various sequence processing probl...
Abstract. Our aim is to develop new database technologies for the approximate matching of unstructur...
In recent years, bioinformatics has become an important research field because there are more and more ...
The suffix tree is a data structure for indexing strings. It is used in a variety of applications su...
Over the last decade, biological sequence repositories have been growing at an exponential rate. Sop...