This paper describes a new technique for parallelizing protein clustering, an important bioinformatics computation for the analysis of protein sequences. Protein clustering identifies groups of proteins that are similar because they share long sequences of similar amino acids. Given a collection of protein sequences, clustering can significantly reduce the computational effort required to identify all similar sequences by avoiding many negative comparisons. The challenge, however, is to build a clustering that misses as few similar sequences (or elements, more generally) as possible. In this paper, we introduce precise clustering, a property that requires each pair of similar elements to appear together in at least one cluster. We show t...
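The precise-clustering property stated above (every pair of similar elements appears together in at least one cluster) can be checked directly on small inputs. The following is a minimal sketch, not the paper's algorithm: the function names `is_precise` and `shares_kmer`, the k-mer similarity rule, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def shares_kmer(a, b, k=3):
    """Hypothetical similarity test: True if a and b share any length-k substring."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers_a for i in range(len(b) - k + 1))


def is_precise(elements, clusters, similar):
    """Return True if every similar pair of elements co-occurs in at least one cluster."""
    cluster_sets = [set(c) for c in clusters]
    for a, b in combinations(elements, 2):
        if similar(a, b) and not any(a in c and b in c for c in cluster_sets):
            return False  # a similar pair that no cluster keeps together
    return True


# Toy data (assumed, not from the paper).
seqs = ["MKVLA", "KVLAT", "GGHHP", "HHPQR"]
good = [["MKVLA", "KVLAT"], ["GGHHP", "HHPQR"]]  # similar pairs kept together
bad = [["MKVLA", "GGHHP"], ["KVLAT", "HHPQR"]]   # similar pairs split apart

print(is_precise(seqs, good, shares_kmer))
print(is_precise(seqs, bad, shares_kmer))
```

This brute-force check compares all pairs, so it is only useful for validating a clustering on small samples; the point of a precise clustering is that later similarity searches need to compare elements only within clusters.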
Clustering is the division of data into groups of similar objects. The main objective of this unsupe...
BACKGROUND: The recent explosion of biological data brings a great challenge for the traditional cluste...
Finding homologous proteins (or clusters of homologous proteins) is very important, since this info...
Proteins are macromolecules that play a pivotal role in biological processes in living organisms. St...
One of the main reasons for protein clustering is prediction of structure, function and evolution. M...
In recent years, we have seen an enormous growth in the amount of available commercial and scientifi...
We present a fast algorithm for sequence clustering and searching which works with large sequence da...
This paper presents SpCLUST, a new C++ package that takes a list of sequences ...
Background: Fueled by rapid progress in high-throughput sequencing, the size of public sequence datab...
In this paper, a technique to reduce time and space during protein sequence clustering and classific...
Visualization of large-scale protein databases may help biologists in discovering similarity betwe...
An important problem in genomics is automatically clustering homologous proteins when only sequence...
Background: An important problem in computational biology is the automatic det...
An important problem in genomics is automatically clustering homologous proteins when only sequence ...