In this paper, we investigate stability-based methods for cluster model selection, in particular for selecting the number K of clusters. The scenario under consideration is that clustering is performed by minimizing a certain clustering quality function, and that a unique global minimizer exists. On the one hand, we show that stability can be upper bounded by certain properties of the optimal clustering, namely by the mass in a small tube around the cluster boundaries. On the other hand, we provide counterexamples which show that the reverse statement is not true in general. Finally, we give some examples and arguments why, from a theoretical point of view, using clustering stability in a large-sample setting can be problematic. It can be seen that d...
A unified theory is presented to assess the robustness of general clustering methods (GCM), i.e., me...
In cluster analysis, selecting the number of clusters is an "ill-posed" problem of crucial importanc...
Two robustness criteria are presented that are applicable to general clustering methods. Rob...
A popular method for selecting the number of clusters is based on stability arguments: one chooses t...
Stability is a common tool to verify the validity of sample based algorithms. In clustering it is wi...
A popular method for selecting the number of clusters is based on stability arguments: one chooses ...
Over the past few years, the notion of stability in data clustering has received growing attention a...
We phrase K-means clustering as an empirical risk minimization procedure over a class HK and explici...
Optimal clustering is a notoriously hard task. Recently, several papers have suggested a new approac...
Selecting the number of clusters is one of the greatest challenges in clustering analysis. In this t...
The advent of high throughput technologies, in particular microarrays, for biological research has r...
Includes bibliographical references (p. 56-58). We present an algorithm called HS-means, which is abl...
Among the areas of data and text mining which are employed today in OR, science, economy and technol...