Motivation: Storing, transmitting and archiving data produced by next-generation sequencing is a significant computational burden. New compression techniques tailored to short-read sequence data are needed. Results: We present here an approach to compression that reduces the difficulty of managing large-scale sequencing data. Our novel approach sits between pure reference-based compression and reference-free compression and combines much of the benefit of reference-based approaches with the flexibility of de novo encoding. Our method, called path encoding, draws a connection between storing paths in de Bruijn graphs and context-dependent arithmetic coding. Supporting this method is a system to compactly store sets of kmers that is of indepe...
Relative compression, where a set of similar strings are compressed with respect to a reference stri...
The ever increasing growth of the production of high-throughput sequencing data poses a serious chal...
The high throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce...
Motivation: Storing, transmitting, and archiving data produced by next generation sequencing is a si...
Motivation: The storage and transmission of high-throughput sequencing data consumes signifi-cant re...
MOTIVATION: The storage and transmission of high-throughput sequencing data consumes significant res...
BackgroundHigh-throughput sequencing (HTS) technologies play important roles in the life sciences by...
Aligning short reads produced by high throughput sequencing equipments onto a reference genome is th...
The increasing volume of biological data requires finding new ways to save these data in genetic ban...
The exponential growth of genomic data has recently motivated the development of compression algorit...
Today, Next Generation Sequencing (NGS) technologies play a vital role for many research fields such...
Holley G, Wittler R, Stoye J, Hach F. Dynamic Alignment-Free and Reference-Free Read Compression. JO...
Large biological datasets are being produced at a rapid pace and create substantial storage challeng...
Abstract Modern high-throughput sequencing technologies are able to generate DNA sequences at an ev...
DNA sequencing is the process of determining the ordered sequence of the four nucleotide bases in a ...
Relative compression, where a set of similar strings are compressed with respect to a reference stri...
The ever increasing growth of the production of high-throughput sequencing data poses a serious chal...
The high throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce...
Motivation: Storing, transmitting, and archiving data produced by next generation sequencing is a si...
Motivation: The storage and transmission of high-throughput sequencing data consumes signifi-cant re...
MOTIVATION: The storage and transmission of high-throughput sequencing data consumes significant res...
BackgroundHigh-throughput sequencing (HTS) technologies play important roles in the life sciences by...
Aligning short reads produced by high throughput sequencing equipments onto a reference genome is th...
The increasing volume of biological data requires finding new ways to save these data in genetic ban...
The exponential growth of genomic data has recently motivated the development of compression algorit...
Today, Next Generation Sequencing (NGS) technologies play a vital role for many research fields such...
Holley G, Wittler R, Stoye J, Hach F. Dynamic Alignment-Free and Reference-Free Read Compression. JO...
Large biological datasets are being produced at a rapid pace and create substantial storage challeng...
Abstract Modern high-throughput sequencing technologies are able to generate DNA sequences at an ev...
DNA sequencing is the process of determining the ordered sequence of the four nucleotide bases in a ...
Relative compression, where a set of similar strings are compressed with respect to a reference stri...
The ever increasing growth of the production of high-throughput sequencing data poses a serious chal...
The high throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce...