The construction of suffix trees for very long sequences is essential for many applications, and it plays a central role in bioinformatics. With the advent of modern sequencing technologies, biological sequence databases have grown dramatically. The methodologies required to analyze these data have also become increasingly complex, requiring fast queries across multiple genomes. In this paper we presented Parallel Continuous Flow (PCF), a parallel suffix tree construction method that is suitable for very long strings. We tested our method by constructing the suffix tree of the entire human genome, about 3 GB. We showed that PCF scales gracefully as the size of the input string grows. Our method can work with an efficiency of 90% wi...
Suffix trees are one of the most versatile data structures in stringology, with many applications in...
Sequence alignment is one of the most important applications in computational biology, and is used f...
With advances in high-throughput sequencing methods, and the corresponding exponential growth in seq...
A suffix tree is a fundamental data structure for string searching algorithms. Unfortunately, when ...
With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data...
Mammalian genomes are typically 3 Gbp (gigabase pairs) in size. The largest public database, NCBI (Na...
Abstract. Due to advances in the so-called Next Generation Sequencing technologies (NGS), the a...
Abstract. Suffix arrays are a simple and powerful data structure for text processing that can be use...
The suffix tree is a data structure for indexing strings. It is used in a variety of applications su...
Online persistent suffix tree construction has been considered impractical due to its excessive I/O ...
Online persistent suffix tree construction has been considered impractical due to its excessive I/O...
Abstract. Suffix trees have been established as one of the most versatile index structures for unstr...
Abstract. We present a new variant of the suffix tree called a distributed suffix tree (DST) which all...
Sequence datasets are ubiquitous in modern life-science applications, and querying sequences is a co...
In recent years, bioinformatics has become an important research field because there are more and more ...