Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and, hence, are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full resolution. As our approach relies directly on redu...
The increase in memory and in network traffic used and caused by new sequenced biological data has r...
Motivation: Genomic repositories are rapidly growing, as witnessed by the 1000 Genomes or the UK10K ...
With high throughput DNA sequencing costs dropping below $1000 for human genomes, data storage, retr...
Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing th...
Motivation: The past decade has seen the introduction of new technologies that significantly lowered...
Abstract: Recent advancements in sequencing technology have led to a drastic reduction in the cost o...
High-throughput sequencing of DNA molecules has revolutionized biomedical research by enabling the q...
To the Editor: Most next-generation sequencing (NGS) quality scores are space intensive, redundant ...
Massive amounts of sequencing data are being generated thanks to advances in sequencing technology a...
Current NGS techniques are becoming exponentially cheaper. As a result, there is an exponential grow...
DNA sequencing is the process of determining the ordered sequence of the four nucleotide bases in a ...
Motivation: The growth of next-generation sequencing means that more effective and efficient archivi...
The intensive research interest in studying genomes has led to a series of advances in DNA sequencin...
Cox AJ, Bauer MJ, Jakobi T, Rosone G. Large-scale compression of genomic sequence databases with the...