One of the most common ways to search a sequence database for sequences similar to a query sequence is to use a k-mer index, as BLAST does. A major problem with k-mer indexes is the space required to store the lists of all occurrences of all k-mers in the database. One method for reducing both the space needed and the query time is sampling, in which only some k-mer occurrences are stored. Most previous work uses hard sampling, in which enough k-mer occurrences are retained to guarantee that all similar sequences are found. In contrast, we study soft sampling, which further reduces the number of stored k-mer occurrences at the cost of decreased query accuracy. We focus on finding highly similar local alignments (HSLA) over nucleotide ...
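To make the space and accuracy trade-off concrete, here is a minimal Python sketch of a sampled k-mer index. It assumes the simplest possible scheme, keeping only occurrences whose start position is a multiple of a fixed `step`; this is an illustrative assumption and not necessarily the sampling strategy studied in the paper. The function names `build_sampled_index` and `query_hits` and the toy sequence are hypothetical.

```python
from collections import defaultdict


def build_sampled_index(db, k, step):
    """Map each k-mer to its start positions in db, keeping only
    positions that are multiples of `step` (step=1 keeps everything)."""
    index = defaultdict(list)
    for pos in range(0, len(db) - k + 1, step):
        index[db[pos:pos + k]].append(pos)
    return index


def query_hits(index, query, k):
    """Return (query_pos, db_pos) pairs for every query k-mer
    that has a stored occurrence in the index."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for dpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, dpos))
    return hits


# Toy database (hypothetical data, for illustration only).
db = "ACGTACGTTGCA"
full = build_sampled_index(db, k=4, step=1)      # every occurrence stored
sampled = build_sampled_index(db, k=4, step=4)   # one occurrence in four stored

print(sum(len(v) for v in full.values()))     # stored occurrences, full index
print(sum(len(v) for v in sampled.values()))  # stored occurrences, sampled index
print(query_hits(full, "CGTT", 4))            # found in the full index
print(query_hits(sampled, "CGTT", 4))         # missed by the sampled index
```

With a fixed step `s`, any exact match of length at least `k + s - 1` still contains a stored k-mer, which is the kind of guarantee hard sampling preserves. The query `"CGTT"` is shorter than that bound for `k = 4` and `s = 4`, so the sampled index misses it. Sampling more sparsely than the guarantee allows, as soft sampling does, saves further space at the price of more misses of this kind.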
Current biological sequence comparison tools utilize full database searches to find approximate matc...
Wolfsheimer S, Herms I, Rahmann S, Hartmann AK. Accurate statistics for local sequence alignment wit...
Bioinformatics applications and pipelines increasingly use k-mer indexes to search for similar seque...
Motivation: Recent experimental studies on compressed indexes (BWT, CSA, FM-index) have confirmed th...
Efficient and accurate search in biological sequence databases remains a matter of priority due to t...
Motivation: Detection of maximal exact matches (MEMs) between two long sequences is a fundamental prob...
The computational power needed for searching exponentially growing databases, such as GenBank, has i...
Short-read aligners predominantly use the FM-index, which is easily able to index one or a few human...
While short read aligners, which predominantly use the FM-index, are able to easily index one or a f...
The FM-index is a data structure used in genomics for exact search of input sequences over large ref...
Molecular biologists, geneticists, and other life scientists use the BLAST homology search package a...