The development of next-generation sequencing (NGS) technology presents a considerable challenge for data storage. A number of compression algorithms have been developed to address it, but currently used algorithms fail to achieve a high compression ratio and a high compression speed at the same time. We propose STrieGD, an algorithm based on a trie index structure that improves the compression speed of FASTQ files. To reduce the size of the trie index, our approach adopts a sampling strategy followed by a filtering step that uses quality scores. Our experiment shows that the compression ratio of our algorithm is approximately 50% higher than that of GZip and nearly equal to that of DSRC. ...
Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing th...
Over the past few years the amount of digital memory and network traffic used by sequenced biologica...
Motivation: Storing, transferring, and maintaining genomic databases becomes a major challenge beca...
The intensive research interest in studying genomes has led to a series of advances in DNA sequencin...
Next generation sequencing (NGS) technologies have gained considerable popularity among biologists. ...
A modern sequencing instrument is able to generate hundreds of millions of short reads of genomic da...
DNA sequencing is the process of determining the ordered sequence of the four nucleotide bases in a ...
The increase in memory and in network traffic used and caused by new sequenced biological data has r...
With high throughput DNA sequencing costs dropping below $1000 for human genomes, data storage, retr...
The exponential growth of high-throughput DNA sequence data has posed great challenges to genomic da...
Motivation: The past decade has seen the introduction of new technologies that significantly lowered...
Storage and transmission of the data produced by modern DNA sequencing instruments has become a majo...
Background: High-throughput sequencing (HTS) technologies play important roles in the life sciences by...
Background: The massive quantities of genetic data generated by high-throughput sequencing pose challe...
Background: As Next-Generation Sequencing data becomes available, existing hardware environments do ...