International audienceThe de Bruijn graph data structure is widely used in next-generation sequencing (NGS). Many programs, e.g. de novo assemblers, rely on in-memory representation of this graph. However, current techniques for representing the de Bruijn graph of a human genome require a large amount of memory (> 30 GB). We propose a new encoding of the de Bruijn graph, which occupies an order of magnitude less space than current representations. The encoding is based on a Bloom filter, with an additional structure to remove critical false positives. An assembly software implementing this structure, Minia, performed a complete de novo assembly of human genome short reads using 5.7 Gb of memory in 23 hours
De novo genome assembly is cornerstone to modern genomics studies. It is also a useful method for st...
Emerging next-generation sequencing technologies have opened up exciting new opportunities for genom...
International audienceAs datasets of DNA reads grow rapidly, it becomes more and more important to r...
International audienceThe de Bruijn graph data structure is widely used in next-generation sequencin...
International audienceThe de Bruijn graph plays an important role in bioinformatics, especially in t...
Abstract. De Brujin graphs are widely used in bioinformatics for pro-cessing next-generation sequenc...
12 pages, submittedInternational audienceDe Brujin graphs are widely used in bioinformatics for proc...
Abstract. The de Bruijn graph plays an important role in bioinformatics, especially in the context o...
The de Bruijn graph has become a standard method in the analysis of sequencing reads in computationa...
Background Processing of reads from high throughput sequencing is often done in term...
International audienceDNA sequencing data continue to progress toward longer reads with increasingly...
Motivation: The de Bruijn graph is a simple and efficient data structure that is used in many areas ...
The recent advent of massively parallel sequencing technologies has drastically reduced the cost of ...
Abstract. High throughput sequencing technologies have become fast and cheap in the past years. As a...
De novo genome assembly is cornerstone to modern genomics studies. It is also a useful method for st...
Emerging next-generation sequencing technologies have opened up exciting new opportunities for genom...
International audienceAs datasets of DNA reads grow rapidly, it becomes more and more important to r...
International audienceThe de Bruijn graph data structure is widely used in next-generation sequencin...
International audienceThe de Bruijn graph plays an important role in bioinformatics, especially in t...
Abstract. De Brujin graphs are widely used in bioinformatics for pro-cessing next-generation sequenc...
12 pages, submittedInternational audienceDe Brujin graphs are widely used in bioinformatics for proc...
Abstract. The de Bruijn graph plays an important role in bioinformatics, especially in the context o...
The de Bruijn graph has become a standard method in the analysis of sequencing reads in computationa...
Background Processing of reads from high throughput sequencing is often done in term...
International audienceDNA sequencing data continue to progress toward longer reads with increasingly...
Motivation: The de Bruijn graph is a simple and efficient data structure that is used in many areas ...
The recent advent of massively parallel sequencing technologies has drastically reduced the cost of ...
Abstract. High throughput sequencing technologies have become fast and cheap in the past years. As a...
De novo genome assembly is cornerstone to modern genomics studies. It is also a useful method for st...
Emerging next-generation sequencing technologies have opened up exciting new opportunities for genom...
International audienceAs datasets of DNA reads grow rapidly, it becomes more and more important to r...