Scalable clustering algorithms that work with a wide variety of distance measures and incorporate application-specific requirements are critically important for modern data analysis and predictive modeling. In this thesis, we propose and analyze a large class of such algorithms, evaluate their performance on benchmark datasets, and investigate theoretical connections between the proposed algorithms and lossy compression and stochastic prediction. First, a wide variety of popular centroid-based clustering algorithms are unified using a large class of distance measures known as Bregman divergences. We present both hard and soft clustering algorithms using Bregman divergences. By establishing a bijection between regular exponential...
In classical clustering, each data point is assigned to at least one cluster. However, in many ...
The Euclidean K-means problem is fundamental to clustering and over the years it has been intensely ...
This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised cl...
A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Ma...
A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Ma...
Using a trimming approach, we investigate a k-means type method based on Bregm...
We review Bregman divergences and use them in clustering algorithms which we have previously develop...
The k-means method is the method of choice for clustering large-scale data sets and it performs exce...
In traditional clustering, every data point is assigned to at least one cluster. On the other extrem...
A growing number of data-based applications are used for decision-making that have far-reaching cons...
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, pri...
We review Bregman divergences and use them in clustering algorithms which we have previously develop...
The scope of the well-known $k$-means algorithm has been broadly extended with...
An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if t...
The $k$-means algorithm is the method of choice for clustering large-scale data sets and it performs...