A global classification of all currently known protein sequences is performed. Every protein sequence is partitioned into segments of 50 amino acids, and a dynamic-programming distance is calculated between each pair of segments. This space of segments is first embedded into Euclidean space with small metric distortion. A novel self-organized, cross-validated clustering algorithm is then applied to the embedded space with Euclidean distances. The resulting hierarchical tree of clusters offers a new representation of protein sequences and families, which compares favorably with the most up-to-date classifications based on functional and structural protein data. Motifs and domains such as the Zinc Finger, EF hand, Homeobox, EGF-like and others are ...
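The first two steps of the abstract (cutting sequences into 50-residue segments and computing a dynamic-programming distance between every pair) can be illustrated with a minimal sketch. The abstract does not say whether the segments overlap or which scoring scheme the distance uses, so the code below assumes non-overlapping windows and substitutes a plain unit-cost edit distance as a stand-in; the function names `split_into_segments`, `edit_distance`, and `pairwise_distances` are hypothetical and not taken from the paper.

```python
from itertools import combinations

SEGMENT_LENGTH = 50  # segment size stated in the abstract


def split_into_segments(sequence, length=SEGMENT_LENGTH):
    """Cut a sequence into consecutive non-overlapping windows.

    Assumption: windows do not overlap, and a trailing piece shorter
    than `length` is dropped.
    """
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, length)]


def edit_distance(a, b):
    """Unit-cost edit distance via dynamic programming, one row at a time.

    Stand-in for the paper's unspecified dynamic-programming distance.
    """
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def pairwise_distances(segments):
    """Return {(i, j): distance} for every unordered pair of segments."""
    return {(i, j): edit_distance(segments[i], segments[j])
            for i, j in combinations(range(len(segments)), 2)}
```

The resulting dictionary of pairwise distances is the kind of input a later embedding and clustering stage would consume. A substitution-matrix-based alignment score would be a more realistic choice for protein data than the unit-cost distance used here.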
We developed a method based on hierarchical self-organizing maps (SOMs) to recognize patterns in pro...
Next-generation sequencing has allowed many new protein sequences to be identified. However, this ex...
Krause A, Stoye J, Vingron M. Large scale hierarchical clustering of protein sequences. BMC Bioinfor...
Background Searching a biological sequence database with a query sequence looking for homologues has...
We investigate the space of all protein sequences. We combine the standard measures of similarity (S...
This dissertation is concerned with the construction and validation of an organizational framework f...
Biological research has generated vast quantities of protein sequences. One of the current outstandi...
Abstract Background New computational resources are needed to manage the increasing volume of biolog...
Background: Searching a biological sequence database with a query sequence looking for homologues ha...
Protein sequence analysis is an important task in bioinformatics. The classification of protein sequ...
Establishing structure-function relationships on the proteomic scale is a unique challenge faced by ...