In this paper, we introduce a new and highly scalable algorithm, PASTA, for large-scale multiple sequence alignment estimation. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy of the leading alignment methods on large datasets, and is able to analyze much larger datasets than the current methods. We also show that trees estimated on PASTA alignments are highly accurate: slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is very fast, hi...
Background: We propose a multiple sequence alignment (MSA) algorithm and compare the alignme...
With the rapid development of genome sequencing, an ever-increasing number of molecular biology anal...
This dataset comprises a GitHub repository containing all the data, analysis, Nextflow workflows and ...
Background: Multiple sequence alignment is an important task in bioinformatics, and alignmen...
The focus of this thesis is on large-scale progressive protein multiple sequence alignment algorithm...
Multiple sequence alignments (MSAs) are used for structural1,2 and evolutionary predictions1,2, but ...
In these days of significant changes and the rapid evolution of technology, the amount of datascienc...
A central focus of computational biology is to organize and make use of vast stores of molecular seq...
Sequence alignment has become a routine procedure in evolutionary biology in looking for evolutionar...
We consider the problem of aligning two very long biological sequences. The score for the best align...
The challenge of comparing two or more genomes that have undergone recombination and substantial amo...
An essential tool in biology is the alignment of multiple sequences. Biologists use multiple sequenc...
Motivation: To construct a multiple sequence alignment (MSA) of a large number (>10,000) of seque...
Background: Continuing research into the global multiple sequence alignment problem has resu...