Graphs such as de Bruijn graphs and OLC (overlap-layout-consensus) graphs have been widely adopted for the de novo assembly of genomic short reads. This work studies another important problem in the field: how graphs can be used for high-performance compression of large-scale sequencing data. We present a novel graph definition, the Hamming-Shifting graph, to address this problem. The definition originates from the technological characteristics of next-generation sequencing machines and aims to link all pairs of distinct reads that have a small Hamming distance, a small shifting offset, or both. We compute multiple lexicographically minimal k-mers to index the reads for an efficient search of the lightest-weight edges, and we prove a ve...
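The two relations named in the abstract above, and the idea of indexing reads by their lexicographically minimal k-mers, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the edge-weight definition or the search procedure, so the function names, the `max_shift` bound, the `count` parameter, and the assumption that all reads have equal length are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must have equal length")
    return sum(x != y for x, y in zip(a, b))


def shifting_offset(a, b, max_shift):
    """Smallest shift s in 1..max_shift such that the suffix of `a`
    starting at s equals the prefix of `b` of the same length.
    Returns None if no such shift exists. Assumes equal-length reads."""
    for s in range(1, max_shift + 1):
        if a[s:] == b[:len(a) - s]:
            return s
    return None


def minimal_kmers(read, k, count):
    """The `count` lexicographically smallest distinct k-mers of a read."""
    kmers = sorted({read[i:i + k] for i in range(len(read) - k + 1)})
    return kmers[:count]


def candidate_pairs(reads, k, count):
    """Index reads by their minimal k-mers and return the index pairs of
    reads sharing at least one such k-mer (hypothetical candidate edges)."""
    index = defaultdict(list)
    for i, read in enumerate(reads):
        for kmer in minimal_kmers(read, k, count):
            index[kmer].append(i)
    pairs = set()
    for bucket in index.values():
        for x in range(len(bucket)):
            for y in range(x + 1, len(bucket)):
                pairs.add((bucket[x], bucket[y]))
    return pairs
```

In this sketch, only reads that share a minimal k-mer are compared, which avoids an all-against-all comparison; whether the paper bounds the search in the same way is not stated in the abstract.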
Graph based non-linear reference structures such as variation graphs and colored de Bruijn graphs en...
Background: Next Generation Sequencing (NGS) has dramatically enhanced our abili...
Whole-genome shotgun sequencing is an experimental technique used for obtaining information about a ...
The amount of sequence data has increased exponentially during the last decade. This applies especia...
The de Bruijn graph has become a standard method in the analysis of sequencing reads in computationa...
DNA sequencing data continue to progress toward longer reads with increasingly...
The growing volume of generated DNA sequencing data makes the problem of its long-term storage incre...
The analysis of next-generation sequencing data from large genomes is a timely...
We are rapidly approaching the point where we have sequenced millions of human genomes. There is a p...
Part 1: Algorithms, Scheduling, Analysis, and Data Mining. Massively parallel DN...
As a result of next generation sequencing technologies, during the last decade many studies have aim...
The recent advent of massively parallel sequencing technologies has drastically reduced the cost of ...
Graduation date: 2012. Within the past several years the technology of high-throughput sequencing has ...
Background: Next generation sequencing technologies have greatly advanced many research areas of the...