We describe a new algorithm, Meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (de Bruijn) graph of oligonucleotides with unique high-quality extensions in the dataset, avoiding the explicit error correction step used in other short-read assemblers. A novel memory-...
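The conservative traversal described in this abstract can be illustrated with a small sketch. This is not the Meraculous implementation. It assumes a simple count threshold (`min_count`, a hypothetical parameter) as a stand-in for "high quality", ignores reverse complements and base quality scores, and uses hypothetical function names (`count_extensions`, `unique_base`, `walk_right`). It shows only the core idea: extend a contig while each k-mer has exactly one well-supported neighbor in both directions, and stop at any ambiguity.

```python
from collections import Counter, defaultdict


def count_extensions(reads, k):
    """Count the bases seen immediately left and right of every k-mer."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if i > 0:
                left[kmer][read[i - 1]] += 1
            if i + k < len(read):
                right[kmer][read[i + k]] += 1
    return left, right


def unique_base(counts, min_count):
    """Return the single base seen at least min_count times, else None."""
    supported = [base for base, n in counts.items() if n >= min_count]
    return supported[0] if len(supported) == 1 else None


def walk_right(start, left, right, min_count=2):
    """Extend a contig rightward from `start` while extensions stay unique.

    The walk stops at a fork, a dead end, a repeat of an already visited
    k-mer, or when the next k-mer does not uniquely point back to the
    current one (the reciprocal check).
    """
    contig = start
    kmer = start
    seen = {start}
    while True:
        base = unique_base(right[kmer], min_count)
        if base is None:
            break
        nxt = kmer[1:] + base
        if nxt in seen or unique_base(left[nxt], min_count) != kmer[0]:
            break
        contig += base
        seen.add(nxt)
        kmer = nxt
    return contig


if __name__ == "__main__":
    # Toy data: three overlapping reads, each sequenced twice.
    genome = "ACGTTGCATGGC"
    reads = [genome[0:8], genome[3:11], genome[4:12]] * 2
    left, right = count_extensions(reads, k=4)
    print(walk_right("ACGT", left, right))
```

Because low-count extensions are ignored rather than repaired, an isolated sequencing error does not create a fork, which is one way to read the abstract's remark about avoiding an explicit error correction step. A well-supported alternative extension, by contrast, ends the contig instead of forcing a guess.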
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and p...
Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION are capable...
Genome assembly is the problem of reconstructing genomes from DNA sequence reads. Even the best asse...
A critical problem for computational genomics is de novo genome assembly: the develop...
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially...
One of the most significant advances in biology has been the ability to sequence the DNA of organism...
The latest revolution in the DNA sequencing field has been brought about by the development of autom...
De novo genome assembly and k-mer frequency counting are two of the classical problems of Bioinfo...
The recent advent of massively parallel sequencing technologies has drastically reduced the cost of ...
Background: Genomic data have become major resources to understand complex mechanisms at fine-scale te...
Next-generation sequencing is advantageous because of its much higher data throughput and much lower...
We study a data-efficient and practical de novo genome assembly algorithm. Due to the advancement...
Background: Short-read DNA sequencing technologies have revolutionized genome assembly by pr...