Using a trimming approach, we investigate a k-means type method based on Bregman divergences for clustering data possibly corrupted with clutter noise. The main interest of Bregman divergences is that the standard Lloyd algorithm adapts to these distortion measures, and they are well-suited for clustering data sampled according to mixture models from exponential families. We prove that there exists an optimal codebook, and that an empirically optimal codebook converges a.s. to an optimal codebook in the distortion sense. Moreover, we obtain the sub-Gaussian rate of convergence for k-means, $1/\sqrt{n}$, under mild tail assumptions. Also, we derive a Lloyd-type algorithm with a trimming parameter that can be selected from data a...
Bregman divergences play a central role in the design and analysis of a range of machine learning al...
An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if t...
The Euclidean K-means problem is fundamental to clustering and over the years it has been intensely ...
A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Ma...
The k-means method is the method of choice for clustering large-scale data sets and it performs exce...
We review Bregman divergences and use them in clustering algorithms which we have previously develop...
Scalable clustering algorithms that can work with a wide variety of distance measures and also incor...
The scope of the well-known $k$-means algorithm has been broadly extended with...
The $k$-means algorithm is the method of choice for clustering large-scale data sets and it performs...
In traditional clustering, every data point is assigned to at least one cluster. On the other extrem...
In this note, we introduce a new algorithm to deal with finite dimensional clustering with errors in...
Bregman divergences generalize measures such as the squared Euclidean distance and the KL divergenc...
The link with exponential families has allowed $k$-means clustering to be generalized to a wide vari...
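The first abstract above describes a Lloyd-type iteration for Bregman divergences with a trimming step. A minimal sketch of that idea follows, under stated assumptions: it is not the authors' algorithm, the function names (`squared_euclidean`, `generalized_kl`, `assign_and_trim`, `trimmed_bregman_lloyd`) and the `keep` parameter are hypothetical, and the number of retained points is fixed by hand rather than selected from data. It relies on the standard fact that, for any Bregman divergence, the arithmetic mean of a cluster minimizes the summed divergence to a center placed in the second argument.

```python
import math


def squared_euclidean(x, c):
    # Bregman divergence generated by phi(x) = ||x||^2.
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def generalized_kl(x, c):
    # Bregman divergence generated by phi(x) = sum(x log x - x).
    # Assumes every coordinate of x and c is strictly positive.
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, c))


def assign_and_trim(points, centers, divergence, keep):
    # Assign each point to its nearest center, then keep only the `keep`
    # points with the smallest divergence; trimmed points get label -1.
    scored = []
    for idx, x in enumerate(points):
        dists = [divergence(x, c) for c in centers]
        j = min(range(len(centers)), key=dists.__getitem__)
        scored.append((dists[j], idx, j))
    scored.sort()
    labels = [-1] * len(points)
    for _, idx, j in scored[:keep]:
        labels[idx] = j
    return labels


def trimmed_bregman_lloyd(points, init_centers, divergence, keep, n_iter=50):
    centers = [list(c) for c in init_centers]
    for _ in range(n_iter):
        labels = assign_and_trim(points, centers, divergence, keep)
        new_centers = []
        for j, c in enumerate(centers):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                # The mean of the retained members is the Bregman centroid.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Leave an empty cluster's center where it was.
                new_centers.append(c)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign_and_trim(points, centers, divergence, keep)


# Toy data: two tight groups plus one gross outlier, with 4 of 5 points kept.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0), (100.0, 100.0)]
centers, labels = trimmed_bregman_lloyd(
    data, [(0.0, 0.0), (10.0, 10.0)], squared_euclidean, keep=4
)
# centers -> [[0.0, 1.0], [10.0, 11.0]], labels -> [0, 0, 1, 1, -1]
```

Swapping `squared_euclidean` for `generalized_kl` changes only the assignment and trimming order; the center update stays the plain mean, which is what lets the standard Lloyd scheme carry over to these distortion measures.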