We phrase K-means clustering as an empirical risk minimization procedure over a class $H_K$ and explicitly calculate the covering number for this class. Next, we show that stability of K-means clustering is characterized by the geometry of $H_K$ with respect to the underlying distribution. We prove that in the case of a unique global minimizer, the clustering solution is stable with respect to complete changes of the data, while for the case of multiple minimizers, the change of $\Omega(n^{1/2})$ samples defines the transition between stability and instability. While for a finite number of minimizers this result follows from multinomial distribution estimates, the case of infinitely many minimizers requires more refined tools. We conclude by proving that stabili...
A novel center-based clustering algorithm is proposed in this paper. We first formulate clustering ...
We investigate the role of the initialization for the stability of the k-means clustering ...
Over the past few years, the notion of stability in data clustering has received growing attention a...
In this paper, we investigate stability-based methods for cluster model selection, in particular to ...
Stability is a common tool to verify the validity of sample based algorithms. In clustering it is wi...
Stability is a common tool to verify the validity of sample based algorithms. In clustering it is wi...
In this paper, we define and study a new notion of stability for the $k$-means...
A popular method for selecting the number of clusters is based on stability arguments: one chooses t...
Optimal clustering is a notoriously hard task. Recently, several papers have suggested a new approac...
K-means clustering is widely used for exploratory data analysis. While its dependence on initialisat...
A popular method for selecting the number of clusters is based on stability arguments: one chooses ...
Probably the most famous clustering formulation is k-means. This is the focus today. Note: k-means i...
K-Means is one of the most popular clustering algorithms, and it is easy to implement. It seeks to m...
For improving the performance of K-means on the nonconvex cluster, a multiple-means clustering metho...
We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean...