The ubiquity of next-generation sequencing has transformed the size and nature of many databases, pushing the boundaries of current indexing and searching methods. One particular example is a database of 2,652 human RNA-seq experiments uploaded to the Sequence Read Archive (SRA). Recently, Solomon and Kingsford proposed the Sequence Bloom Tree data structure and demonstrated how it can be used to accurately identify SRA samples in which a transcript of interest is potentially expressed. In this paper, we propose an improvement called the AllSome Sequence Bloom Tree. Results show that our new data structure significantly improves performance, reducing the tree construction time by 52.7% and query time by 39–85%, with a price of...
The amount of available biological sequences, represented as strings over the DNA and protein alphab...
Background: Since the introduction of next-generation DNA sequencers the rapid increase in s...
A repetitive sequence collection is one where portions of a base sequence of length n are repeated m...
High-throughput RNA sequencing (RNA-seq) is primarily used in measuring gene expression, quantifying...
In the biological sciences, sequence analysis refers to analytical investigations that use nucleic a...
High-throughput sequencing has helped to transform our study of biological organisms and processes. ...
High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potenti...
With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data...
Large corpora of texts or of sequences serve as references and are interrogate...
Motivation: The ever-growing size of sequencing data is a major bottleneck in bioinformatics as th...
Searching for repetitive structures in DNA sequences is a major problem in bioinformatics research. ...
We present SeqOthello, an ultra-fast and memory-efficient indexing structure to support arbitrary se...
Technological advances made over the last decades in sequencing technologies have led to continuous ...
Capillary Electrophoresis (CE) based on Sanger sequencing has given the ability to extract and expla...