Next-generation sequencing (NGS) technologies have generated enormous amounts of shotgun read data, and assembly of the reads can be challenging, especially for organisms without template sequences. We study the power of genome comparison based on shotgun read data without assembly using three alignment-free sequence comparison statistics, D2, D2*, and D2S, both theoretically and by simulations. Theoretical formulas for the power of detecting the relationship between two sequences related through a common motif model are derived. It is shown that both D2* and D2S outperform D2 for detecting the relationship between two sequences based on NGS data. We then study the effects of length of the tuple, read length, coverage, and sequenci...
Alignment-free methods, in which shared properties of sub-sequences (e.g. identity or match length) ...
Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are base...
With the development of next-generation sequencing (NGS) technologies, a large amount of short read ...
Motivation: Next-generation sequencing (NGS) technologies generate large amounts of short read data ...
Genome and metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pos...
Background: Next Generation Sequencing (NGS) machines extract from a biological sample a large numbe...
This study focuses on an alignment-free sequence comparison method: the number of words of length k ...
The development of high-throughput Next Generation Sequencing (NGS) technologies allows to massively...
The D2 statistic is defined as the number of word matches of prespecified length k, with up to t mis...
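As an illustration of the word-match count described above, here is a minimal Python sketch of D2 restricted to exact matches. It uses the standard identity that the number of matching k-word pairs between two sequences equals the sum, over all words w, of the product of the counts of w in each sequence. The function names `kmer_counts` and `d2_exact` are hypothetical and chosen only for this example. The variant allowing up to t differences per word is not covered, since its definition is cut off in the snippet.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_exact(seq_a, seq_b, k):
    """Number of exact k-word matches between two sequences.

    Computed as the sum over words w of X_w * Y_w, where X_w and Y_w
    are the counts of w in seq_a and seq_b. This is an illustrative
    exact-match version only, not the implementation used in any of
    the studies listed here.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b.
    return sum(n * counts_b[w] for w, n in counts_a.items())
```

For example, `d2_exact("AAAA", "AAA", 2)` counts three occurrences of `AA` in the first sequence and two in the second, giving 6 matching pairs. For shotgun read data, the same counts could in principle be accumulated over all reads of a sample rather than over one assembled sequence, although that step is an assumption about usage and is not shown.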
Motivation: Alignment-free sequence comparison methods are still in the early stages of development ...