In sketched clustering, a dataset of T samples is first sketched down to a vector of modest size, from which the centroids are subsequently extracted. Its advantages include 1) reduced storage complexity and 2) centroid extraction complexity independent of T. For the sketching methodology recently proposed by Keriven et al., which can be interpreted as a random sampling of the empirical characteristic function, we propose a sketched clustering algorithm based on approximate message passing. Numerical experiments suggest that our approach is more efficient than the state-of-the-art sketched clustering algorithm “CL-OMPR” (in both computational and sample complexity) and more efficient than k-means++ when T is large.
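To make the sketching step concrete, the following is a minimal Python sketch of computing a random sampling of the empirical characteristic function. It is an illustration under stated assumptions, not the authors' implementation: the function names `compute_sketch` and `sketch_in_batches`, the Gaussian draw of the frequencies `W`, and the synthetic two-cluster data are all hypothetical choices made here. Centroid extraction (the approximate message passing step) is not shown.

```python
import numpy as np


def compute_sketch(X, W):
    """Sample the empirical characteristic function of the rows of X.

    X: (T, d) data matrix, W: (m, d) frequency vectors.
    Returns a complex vector of length m, independent of T.
    """
    # Average exp(i * <w_j, x_t>) over the T samples for each frequency w_j.
    return np.exp(1j * (X @ W.T)).mean(axis=0)


def sketch_in_batches(batches, W):
    """Same sketch, accumulated over batches so the data need not fit in memory."""
    total = np.zeros(W.shape[0], dtype=complex)
    count = 0
    for B in batches:
        total += np.exp(1j * (B @ W.T)).sum(axis=0)
        count += B.shape[0]
    return total / count


rng = np.random.default_rng(0)
T, d, m = 10_000, 2, 50

# Hypothetical data: two unit-variance Gaussian clusters.
centers = np.array([[-3.0, 0.0], [3.0, 0.0]])
X = centers[rng.integers(0, 2, size=T)] + rng.normal(size=(T, d))

# Assumption: frequencies drawn i.i.d. Gaussian, purely for illustration.
W = rng.normal(size=(m, d))

z = compute_sketch(X, W)
z_stream = sketch_in_batches(np.array_split(X, 10), W)
```

The sketch `z` has length `m` regardless of `T`, which is the source of the storage advantage mentioned in the abstract. Because the sketch is an average, it can be accumulated in a single pass over the data, as `sketch_in_batches` does.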
We explore the use of Optical Processing Units (OPU) to compute random Fourier feat...
We propose a sketch-based sampling algorithm, which effectively exploits the data sparsity. Sampling...
The traditional clustering algorithm, K-means, is famous for its simplicity and low time complexity....
In this paper, we address the problem of high-dimensional k-means clustering i...
The Lloyd-Max algorithm is a classical approach to perform K-means clustering....
The quality of K-Means clustering is extremely sensitive to proper initialization. The classic remed...
Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department ...
A clustering algorithm that exploits special characteristics of a data set may lead to superior resu...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
It is an extended version of https://hal.inria.fr/hal-03350599 (official version published with DOI:...
Clustering has been one of the most widely studied topics in data mining and it is often the first s...
Spectral clustering is arguably one of the most important algorithms in data mining and machine inte...
Sketching is a drawing style where approximations and successive refinement in the drawing process a...
Advances in recent techniques for scientific data collection in the era of big data allow for the sy...