De Bruijn graphs play an essential role in bioinformatics, yet they lack a universal scalable representation. Here, we introduce simplitigs as a compact, efficient, and scalable representation, and ProphAsm, a fast algorithm for their computation. For the example of assemblies of model organisms and two bacterial pan-genomes, we compare simplitigs to unitigs, the best existing representation, and demonstrate that simplitigs provide a substantial improvement in the cumulative sequence length and their number. When combined with the commonly used Burrows-Wheeler Transform index, simplitigs reduce memory, and index loading and query times, as demonstrated with large-scale examples of GenBank bacterial pan-genomes.
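The abstract does not spell out how simplitigs are computed, so the following is only a minimal sketch of one greedy heuristic, assuming that a simplitig set is a collection of strings in which every k-mer of the input occurs exactly once. It is not ProphAsm itself. The names `kmers_of` and `greedy_simplitigs` are hypothetical, the seed choice (`min`) is an arbitrary assumption made for determinism, and the sketch ignores reverse complements, which tools for DNA data commonly treat as equivalent to the forward k-mer.

```python
from typing import Iterable, List, Set

ALPHABET = "ACGT"


def kmers_of(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_simplitigs(kmers: Iterable[str]) -> List[str]:
    """Greedily cover a k-mer set with strings so that each k-mer is used once.

    Assumes all k-mers have the same length k >= 1 and use the alphabet ACGT.
    Reverse complements are NOT merged (a simplifying assumption).
    """
    remaining = set(kmers)
    simplitigs = []
    while remaining:
        # Pick a seed k-mer; min() only makes the result deterministic.
        seed = min(remaining)
        remaining.remove(seed)
        k = len(seed)
        s = seed

        # Extend to the right while some unused k-mer overlaps the last k-1 characters.
        while True:
            suffix = s[len(s) - k + 1:]
            nxt = next((suffix + c for c in ALPHABET if suffix + c in remaining), None)
            if nxt is None:
                break
            remaining.remove(nxt)
            s += nxt[-1]

        # Extend to the left while some unused k-mer overlaps the first k-1 characters.
        while True:
            prefix = s[:k - 1]
            prv = next((c + prefix for c in ALPHABET if c + prefix in remaining), None)
            if prv is None:
                break
            remaining.remove(prv)
            s = prv[0] + s

        simplitigs.append(s)
    return simplitigs
```

For instance, `greedy_simplitigs(kmers_of("ACGTTGCA", 3))` covers all six 3-mers with the single string `ACGTTGCA`. Unlike unitigs, the extension here does not stop at branching nodes of the graph, which is why such a cover can use fewer and shorter strings in total.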
New technologies have tremendously increased sequencing throughput compared to traditional technique...
In the life sciences, determining the sequence of bio-molecules is essential s...
Genome assembly is the problem of reconstructing genomes from DNA sequence reads. Even the best asse...
The de Bruijn graph has become a standard method in the analysis of sequencing reads in computationa...
The de Bruijn graph plays an important role in bioinformatics, especially in t...
Background: Processing of reads from high throughput sequencing is often done in term...
The de Bruijn graph data structure is widely used in next-generation sequencin...
Abstract. The de Bruijn graph plays an important role in bioinformatics, especially in the context o...
The de Bruijn graph GK of a set of strings S is a key data structure in genome assembly that represe...
DNA sequencing data continue to progress toward longer reads with increasingly...
Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of...
De novo genome assembly is cornerstone to modern genomics studies. It is also a useful method for st...
A) Shotgun sequences are produced from two different genomes (shown in blue and red at the top). ...
With the increasing impact of genomics in life sciences, the inference of high...
Motivation: The de Bruijn graph is a simple and efficient data structure that is used in many areas ...