In sketched clustering, a dataset of T samples is first sketched down to a vector of modest size, from which the centroids are subsequently extracted. Its advantages include 1) reduced storage complexity and 2) centroid extraction complexity independent of T. For the sketching methodology recently proposed by Keriven et al., which can be interpreted as a random sampling of the empirical characteristic function, we propose a sketched clustering algorithm based on approximate message passing. Numerical experiments suggest that our approach is more efficient than the state-of-the-art sketched clustering algorithm “CL-OMPR” (in both computational and sample complexity) and more efficient than k-means++ when T is large.
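To make the sketching step concrete, the following is a minimal illustration of sampling the empirical characteristic function at random frequencies. It is not the authors' implementation: the function name `compute_sketch`, the Gaussian frequency distribution, and the sizes `T`, `d`, and `m` are all assumptions chosen for illustration, and the centroid-extraction step (approximate message passing) is not shown.

```python
import numpy as np


def compute_sketch(X, W):
    """Sample the empirical characteristic function of X at the rows of W.

    X: (T, d) array of samples.
    W: (m, d) array of frequency vectors (hypothetical choice, see below).
    Returns a complex vector of length m whose size does not depend on T.
    """
    # Entry j is (1/T) * sum_t exp(i * <w_j, x_t>).
    return np.exp(1j * (X @ W.T)).mean(axis=0)


rng = np.random.default_rng(0)
T, d, m = 10_000, 2, 50            # illustrative sizes, not from the paper
X = rng.normal(size=(T, d))        # stand-in dataset
W = rng.normal(size=(m, d))        # assumption: i.i.d. Gaussian frequencies

z = compute_sketch(X, W)

# Sketches of equal-sized chunks can be averaged, so the data can be
# processed in pieces without storing all T samples at once.
z_merged = 0.5 * (compute_sketch(X[: T // 2], W) + compute_sketch(X[T // 2 :], W))
```

Here `z` has length `m` whatever the value of `T`, which is the source of the storage advantage described above. Any extraction algorithm then works from `z` and `W` alone.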
A clustering algorithm that exploits special characteristics of a data set may lead to superior resu...
The k-means clustering algorithm, whilst widely popular, is not without its drawbacks. In this paper...
Cluster analysis is one of the primary data analysis methods and k-means is one of the most well kno...
In sketched clustering, the dataset is first sketched down to a vector of mode...
The Lloyd-Max algorithm is a classical approach to perform K-means clustering....
In this paper, we address the problem of high-dimensional k-means clustering i...
We explore the use of Optical Processing Units (OPU) to compute random Fourier feat...
University of Minnesota M.S.E.E. thesis. August 2015. Major: Electrical Engineering. Advisor: Georgi...
Spectral clustering refers to a family of well-known unsupervised learning alg...
It is an extended version of https://hal.inria.fr/hal-03350599 (official version published with DOI:...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
Large scale duplicate detection, clustering and mining of documents or images ...
We present a method to solve large-scale mixture learning tasks from a sketch ...
Clustering has been one of the most widely studied topics in data mining and it is often the first s...
For an extended version of this article that contains additional references and more in-depth discus...