Data volumes generated by next-generation sequencing (NGS) technologies are now a major concern for both data storage and transmission. This has triggered the need for methods more efficient than general-purpose compression tools, such as the widely used gzip. We present a novel reference-free method for compressing data produced by high-throughput sequencing technologies. Our approach, implemented in the software LEON, employs techniques derived from existing assembly principles. The method is based on a reference probabilistic de Bruijn graph, built de novo from the set of reads and stored in a Bloom filter. Each read is encoded as a path in this graph, by memorizing an anchoring k-mer and a list of bifurcations. T...
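The path encoding described in the abstract can be illustrated with a minimal sketch. This is not LEON's implementation: it assumes an exact Python `set` in place of the Bloom filter, inserts every k-mer of every read into the graph (so no sequencing-error handling is needed), and always anchors a read at its first k-mer. The function names `build_kmer_set`, `successors`, `encode_read`, and `decode_read` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Set, Tuple

ALPHABET = "ACGT"


def build_kmer_set(reads: Iterable[str], k: int) -> Set[str]:
    """Collect every k-mer of every read (an exact stand-in for a Bloom filter)."""
    kmers: Set[str] = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def successors(kmer: str, kmers: Set[str]) -> List[str]:
    """Bases that extend `kmer` by one position to another k-mer in the graph."""
    suffix = kmer[1:]
    return [base for base in ALPHABET if suffix + base in kmers]


def encode_read(read: str, k: int, kmers: Set[str]) -> Tuple[str, int, List[str]]:
    """Encode a read as (anchor k-mer, read length, bases chosen at bifurcations)."""
    anchor = read[:k]
    bifurcations: List[str] = []
    current = anchor
    for base in read[k:]:
        # A base is stored only where the graph does not offer exactly one way forward.
        if len(successors(current, kmers)) != 1:
            bifurcations.append(base)
        current = current[1:] + base
    return anchor, len(read), bifurcations


def decode_read(anchor: str, length: int, bifurcations: List[str], kmers: Set[str]) -> str:
    """Rebuild the read by walking the graph from the anchor."""
    bases = list(anchor)
    current = anchor
    pending = iter(bifurcations)
    while len(bases) < length:
        options = successors(current, kmers)
        # Follow the unique successor, or consume the next stored choice.
        base = options[0] if len(options) == 1 else next(pending)
        bases.append(base)
        current = current[1:] + base
    return "".join(bases)
```

In this sketch a read costs one anchor plus one stored base per branching node, so reads that follow unambiguous paths compress to little more than their anchor. Because every k-mer of a read is guaranteed to be in the set, a unique successor is always the correct one. A real Bloom filter adds false-positive k-mers, which can only create extra bifurcations and never hide a true successor.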
High-throughput sequencing data is rapidly accumulating in public repositories. Making this resource...
Large biological datasets are being produced at a rapid pace and create substantial storage challeng...
Sequencing data are rapidly accumulating in public repositories. Making this resource accessible for...
Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn grap
Dramatic increases in data produced by next-generation sequencing (NGS) technologies demand data com...
We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ a...
Supplementary material. This supplementary file contains full details of datasets used and command l...
DNA sequencing is the process of determining the ordered sequence of the four nucleotide bases in a ...
Over the past few years the amount of digital memory and network traffic used by sequenced biologica...
Background: High-throughput sequencing (HTS) technologies play important roles in the life sciences by...
This chapter deals with the compression of genomic data without reference geno...
The intensive research interest in studying genomes has led to a series of advances in DNA sequencin...