This paper introduces a novel method, called Reference-Based String Alignment (RBSA), that speeds up retrieval of optimal subsequence matches in large databases of sequences under the edit distance and the Smith-Waterman similarity measure. RBSA operates under the assumption that the optimal match deviates from the query by a relatively small amount, one that does not exceed a prespecified fraction of the query length. RBSA has an exact version that guarantees no false dismissals and can handle large queries efficiently. An approximate version of RBSA is also described, which achieves significant additional improvements over the exact version, with negligible losses in retrieval accuracy. RBSA performs filtering of candidate matches us...
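To make the retrieval problem concrete, the following is a minimal sketch of the brute-force dynamic-programming baseline for optimal subsequence matching under the edit distance. It is not RBSA itself, whose filtering step is not reproduced here. The function names `best_subsequence_match` and `within_fraction` and the parameter `delta` (the prespecified fraction of the query length) are hypothetical names chosen for illustration.

```python
from typing import Tuple


def best_subsequence_match(query: str, text: str) -> Tuple[int, int]:
    """Return (distance, end) for the substring of `text` closest to `query`.

    `end` is the exclusive end position in `text` of the first best match.
    Brute-force baseline, O(len(query) * len(text)) time.
    """
    m = len(query)
    # prev[i]: edit distance between query[:i] and the best substring of
    # text ending at the previous text position. Before any text is read,
    # matching query[:i] costs i (every query character is unmatched).
    prev = list(range(m + 1))
    best = (prev[m], 0)
    for j, ch in enumerate(text, start=1):
        # cur[0] = 0 because a match may start at any position in text.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if query[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # align query[i-1] with ch
                prev[i] + 1,         # ch has no counterpart in the query
                cur[i - 1] + 1,      # query[i-1] has no counterpart in text
            )
        if cur[m] < best[0]:
            best = (cur[m], j)
        prev = cur
    return best


def within_fraction(query: str, text: str, delta: float) -> bool:
    """Check whether the best match deviates by at most delta * len(query)."""
    distance, _ = best_subsequence_match(query, text)
    return distance <= delta * len(query)
```

For example, `best_subsequence_match("ACGT", "TTACGTTT")` finds an exact occurrence, while `within_fraction("ACGT", "TTACTTT", 0.25)` accepts a match that is one edit away from a four-character query. A filtering method would aim to avoid running this full computation against every database position.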
Biology researchers have a pressing need for data management technologies which will make the storag...
Accurate alignments of sequences are needed for many types of analyses. Aligned sequences might be t...
Efficient and accurate search in biological sequence databases remains a matter of priority due to t...
Abstract. Sequence alignment is an important task for molecular biologists. Because alignment basic...
We consider the problem of similarity search in a very large sequence database with edit distance as...
We study the problem of local alignment, which is finding pairs of similar subsequences with gaps. T...
In this paper we present an algorithm which attempts to align pairs of subsequences from a database ...
Local Sequence Alignment. The local sequence alignment problem is defined as follows: Given two st...
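As an illustration of local alignment scoring, here is a minimal sketch of the standard Smith-Waterman recurrence with a linear gap penalty. The scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the function name `smith_waterman_score` are assumptions made for this example and are not taken from any of the listed papers.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -1) -> int:
    """Return the best local alignment score between `a` and `b`.

    Uses two rows of the dynamic-programming table, so memory is
    O(len(b)); time is O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                # start a new local alignment here
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] + gap,    # gap in b
                cur[j - 1] + gap, # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best
```

Clamping each cell at zero is what makes the alignment local: a poorly scoring prefix is discarded instead of being carried into later cells.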
Sequence alignment is an important bioinformatics tool for identifying homology, but searching again...
This thesis presents an application of a generalized suffix tree extended by the use of frequency of...
We consider the problem of aligning two very long biological sequences. The score for the best align...
Biological pairwise sequence alignment can be used as a method for arranging two biological sequence...
High throughput sequencing is without a doubt one of the most influential technological advances in ...
Motivation: Recent experimental studies on compressed indexes (BWT, CSA, FM-index) have confirmed th...
Abstract. In the last twenty years, protein databases have been growing exponentially. To speed up t...