We assess the robustness of partitional clustering algorithms applied to gene expression data. Using the SOM and k-means algorithms, both of which rely on random initialisation and may therefore produce different clusterings with different seeds, we generate multiple clusterings from identical parameter settings and input data. We define a reproducibility index, based on the number of pairs of genes consistently clustered together in different clusterings, and use it to assess the algorithms. We also study the effect of noise applied to the original data. Our results show a lack of robustness for both classes of algorithms, with slightly higher reproducibility for SOM than for k-means.
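The abstract does not give the exact formula for the reproducibility index, so the following is only a minimal sketch of one pair-counting measure in that spirit: the Jaccard overlap between the sets of gene pairs that two clusterings place in the same cluster, averaged over all pairs of runs. The function names and the choice of Jaccard normalisation are assumptions made for illustration, not the authors' definition.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the set of index pairs (i, j), i < j, sharing a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pair_reproducibility(labels_a, labels_b):
    """Jaccard overlap of co-clustered pairs between two clusterings.

    Assumed normalisation (not taken from the source): pairs together in
    both clusterings, divided by pairs together in at least one of them.
    Cluster label values themselves do not matter, only the grouping.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must label the same set of genes")
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    if not union:
        # Every gene is a singleton in both clusterings: trivially identical.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


def mean_reproducibility(clusterings):
    """Average pair_reproducibility over all pairs of clustering runs."""
    if len(clusterings) < 2:
        raise ValueError("need at least two clusterings to compare")
    scores = [pair_reproducibility(a, b) for a, b in combinations(clusterings, 2)]
    return sum(scores) / len(scores)
```

In use, each element of `clusterings` would be the label vector from one run of SOM or k-means with a different random seed on the same data. A value near 1 means the runs group the same genes together; a value near 0 means they rarely agree. Enumerating all pairs costs time quadratic in the number of genes, so this form suits small illustrations rather than full-size expression sets.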
Thesis (Ph. D.)--University of Washington, 2001. The invention of DNA microarrays allows us to study s...
Many clustering algorithms have been used to analyze microarray gene expression data. Given embryoni...
Discovery of disease sub-types is one of the fundamental problems in clinical applications. This is ...
Abstract: We assess the robustness of partitional clustering algorithms applied to gene expression d...
The progress in microarray technology is evident and huge amounts of gene expression data are curren...
Abstract. Background: Cluster analysis is an integral part of high dimensional data analysis. In the c...
Abstract. Motivation: Many clustering algorithms have been proposed for the analysis of gene expr...
Motivation: A measurement of cluster quality is needed to choose potential clusters of genes that co...
Data mining technique used in the field of clustering is a subject of active research and assists in...
In the rapidly evolving field of genomics, many clustering and classification methods have been deve...
Clustering algorithms aim, by definition, at partitioning a given set of objects into a set of clust...
Massively high-dimensional datasets are fast becoming commonplace and any advances in the reliable p...
In data analysis, clustering is the process of finding groups in unlabelled data according to simila...
This paper illustrates some of the problems which can occur in any data set when clustering samples ...
We have previously described a statistical framework for using gene expression data from cDNA microa...