Background: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines, it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to.

Results: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences ...
Clustering is the division of data into groups of similar objects. The main objective of this unsupe...
This dissertation is concerned with the construction and validation of an organizational framework f...
Background: An important problem in computational biology is the automatic det...
Krause A, Stoye J, Vingron M. Large scale hierarchical clustering of protein sequences. BMC Bioinfor...
Background: Genome-sequencing projects are currently producing an enormous amount of new sequences a...
MOTIVATION: Proteins can be naturally classified into families of homologous s...
A global classification of all currently known protein sequences is performed. Every protein sequenc...
One of the main reasons for protein clustering is prediction of structure, function and evolution. M...
Biological research has generated vast quantities of protein sequences. One of the current outstandi...