In cluster analysis, selecting the number of clusters is an "ill-posed" problem of crucial importance. In this paper we propose a re-sampling method for assessing cluster stability. Our model suggests that samples' occurrences in clusters can be considered as realizations of the same random variable in the case of the "true" number of clusters. Thus, similarity between different cluster solutions is measured by means of compound and simple probability metrics. Compound criteria result in validation rules employing the stability content of clusters. Simple probability metrics, in particular those based on kernels, provide more flexible geometrical criteria. We analyze several applications of probability metrics combined with methods intended...
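The re-sampling idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a plain pairwise agreement score (the Rand index) where the abstract uses compound and simple probability metrics, and it uses a toy 1-D k-means. All names here (`kmeans_1d`, `pair_agreement`, `stability`) and the synthetic data are assumptions made for illustration.

```python
import random
from itertools import combinations

def kmeans_1d(points, k, rng, iters=30):
    """Plain Lloyd's k-means on 1-D data; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (p - centers[j]) ** 2) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = sum(members) / len(members)
    return labels

def pair_agreement(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (Rand index)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    same = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return same / len(pairs)

def stability(points, k, n_rounds=20, frac=0.8, seed=0):
    """Average agreement between clusterings of two random subsamples."""
    rng = random.Random(seed)
    n = len(points)
    m = int(frac * n)
    scores = []
    for _ in range(n_rounds):
        idx_a = rng.sample(range(n), m)
        idx_b = rng.sample(range(n), m)
        lab_a = dict(zip(idx_a, kmeans_1d([points[i] for i in idx_a], k, rng)))
        lab_b = dict(zip(idx_b, kmeans_1d([points[i] for i in idx_b], k, rng)))
        # Compare the two solutions only on points present in both subsamples.
        shared = sorted(set(idx_a) & set(idx_b))
        scores.append(
            pair_agreement([lab_a[i] for i in shared], [lab_b[i] for i in shared])
        )
    return sum(scores) / len(scores)

# Hypothetical data: three 1-D Gaussian groups of 20 points each.
data_rng = random.Random(1)
data = [data_rng.gauss(c, 0.5) for c in (0.0, 5.0, 10.0) for _ in range(20)]

scores = {k: stability(data, k) for k in range(2, 6)}
for k, s in scores.items():
    print(k, round(s, 3))
```

Under this sketch, the candidate number of clusters with the highest average agreement would be read as the most stable one. The Rand index is known to depend on k, so a real study would likely use an adjusted or normalized score.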
Statistical inference based on the cluster weighted model often requires some subjective judgment fr...
In this work, a novel technique to address the problem of cluster validation based on cluster stabil...
Clustering analysis seeks to partition a given dataset into groups or clusters so that the data obje...
The estimation of the appropriate number of clusters is a known problem in cluster analysis, that a...
We improve instability-based methods for the selection of the number of clusters k in cluster analys...
A popular method for selecting the number of clusters is based on stability arguments: one chooses ...
A popular method for selecting the number of clusters is based on stability arguments: one chooses t...
In this paper, we investigate stability-based methods for cluster model selection, in particular to ...
The assessment of stability in cluster analysis is strongly related to the main difficult problem of...
Over the past few years, the notion of stability in data clustering has received growing attention a...
The advent of high throughput technologies, in particular microarrays, for biological research has r...
This thesis considers four important issues in cluster analysis: cluster validation, estimation of ...
This work is addressed to the problem of cluster validation to determine the right number of cluster...
The advent of high throughput technologies, in particular microarrays, for biological resear...
Includes bibliographical references (p. 56-58). We present an algorithm called HS-means, which is abl...