The most widely used multiple sequence alignment methods require sequences to be clustered as an initial step. Most sequence clustering methods require a full distance matrix to be computed between all pairs of sequences. This requires memory and time proportional to N² for N sequences. When N grows larger than about 10,000, this becomes increasingly prohibitive and can form a significant barrier to carrying out very large multiple alignments. In this paper, we have tested variations on a class of embedding methods that have been designed for clustering large numbers of complex objects where the individual distance calculations are expensive. These methods involve embedding the sequences in a space where the similarities within a set of seq...
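The quadratic cost described in this abstract can be made concrete with a small sketch. It compares the number of pairwise distances in a full matrix with the number needed if each sequence were compared only to a fixed set of k reference sequences. The reference-set scheme and the value k = 100 are illustrative assumptions, since the abstract is truncated before it describes the embedding in detail.

```python
def full_matrix_entries(n: int) -> int:
    """Distinct pairwise distances among n sequences (upper triangle only)."""
    return n * (n - 1) // 2


def embedding_entries(n: int, k: int) -> int:
    """Distances computed if each of n sequences is compared to k references.

    The reference-based scheme is an assumption made for illustration,
    not necessarily the method the abstract goes on to describe.
    """
    return n * k


if __name__ == "__main__":
    k = 100  # hypothetical number of reference sequences
    for n in (1_000, 10_000, 100_000):
        print(n, full_matrix_entries(n), embedding_entries(n, k))
```

Under these assumptions, the full matrix grows quadratically in N while the reference-based count grows linearly, which is why the gap widens quickly once N passes about 10,000.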
In this paper, we introduce a new and highly scalable algorithm, PASTA, for large-scale multiple seq...
This paper proposes a simple and effective approach to improve the accuracy of multiple sequence ali...
Abstract. Aligning multiple DNA or protein sequences is a fundamental step in the analyses of phylog...
Since finding an optimal multiple sequence alignment is an NP-hard problem, various heuristic approa...
The Constrained Multiple Sequence Alignment problem is to align a set of sequences subject to a give...
The Constrained Multiple Sequence Alignment problem is to align a set of sequences subject to a give...
Motivation: To construct a multiple sequence alignment (MSA) of a large number (>10,000) of seque...
An essential tool in biology is the alignment of multiple sequences. Biologists use multiple sequenc...
Abstract. We consider the problem of multiple sequence alignment: given k sequences of length at most ...
Multiple sequence alignment is increasingly important to bioinformatics, with several applications r...
Abstract. Multiple Sequence Alignment (MSA) is one of the most fundamental problems in computationa...
Stoye J, Perrey SW, Dress A. Improving the divide-and-conquer approach to sum-of-pairs multiple sequ...
A central focus of computational biology is to organize and make use of vast stores of molecular seq...
A central focus of computational biology is to organize and make use of vast stores of molecular seq...
The focus of this thesis is on large-scale progressive protein multiple sequence alignment algorithm...