Abstract: The amount of data available in the world is growing faster than our ability to deal with it. However, if we take advantage of its internal structure, data may become much smaller for machine learning purposes. In this paper, we focus on one of the fundamental machine learning tasks, empirical risk minimization (ERM), and provide faster algorithms with the help of the clustering structure of the data. We introduce a simple notion of raw clustering that can be efficiently computed from the data, and propose two algorithms based on clustering information. Our accelerated algorithm ClusterACDM is built on a novel Haar transformation applied to the dual space of the ERM problem, and our variance-reduction-based algorithm ClusterSVRG i...
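For orientation, the ERM setting and the variance-reduction idea named in the abstract can be illustrated with a minimal sketch. It implements plain SVRG on an L2-regularized least-squares objective in pure Python. It is not the paper's ClusterSVRG and uses no clustering information. The choice of loss, the helper names (`grad_i`, `full_grad`, `erm_objective`, `svrg`, `make_toy_data`), the step size, and the toy data are all assumptions made for illustration.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad_i(w, x, y, lam):
    # Gradient of one term: 0.5 * (x.w - y)^2 + 0.5 * lam * ||w||^2
    r = dot(x, w) - y
    return [r * xj + lam * wj for xj, wj in zip(x, w)]


def full_grad(w, X, Y, lam):
    # Gradient of the ERM objective: the average of the per-example gradients.
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        g = [a + b / n for a, b in zip(g, grad_i(w, x, y, lam))]
    return g


def erm_objective(w, X, Y, lam):
    # Empirical risk (mean squared loss) plus L2 regularization.
    n = len(X)
    loss = sum(0.5 * (dot(x, w) - y) ** 2 for x, y in zip(X, Y)) / n
    return loss + 0.5 * lam * dot(w, w)


def svrg(X, Y, lam=0.1, eta=0.05, epochs=30, seed=0):
    # Plain SVRG (assumed hyperparameters; not tuned, not from the paper).
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        snap = list(w)                      # snapshot point
        mu = full_grad(snap, X, Y, lam)     # full gradient at the snapshot
        for _ in range(2 * n):              # inner loop of stochastic steps
            i = rng.randrange(n)
            gw = grad_i(w, X[i], Y[i], lam)
            gs = grad_i(snap, X[i], Y[i], lam)
            # Variance-reduced gradient estimate: gw - gs + mu
            w = [wj - eta * (a - b + m) for wj, a, b, m in zip(w, gw, gs, mu)]
    return w


def make_toy_data(n=50, w_true=(1.0, -2.0, 0.5), seed=1):
    # Hypothetical synthetic regression data, used only for this illustration.
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in w_true] for _ in range(n)]
    Y = [dot(x, w_true) + 0.01 * rng.gauss(0.0, 1.0) for x in X]
    return X, Y
```

Calling `svrg(*make_toy_data())` returns a weight vector at which the full gradient of the regularized objective should be close to zero. The sketch shows only the generic correction term `gw - gs + mu`; how clustering information would enter is left to the paper itself.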
Clustering is often formulated as the maximum likelihood estimation of a mixture model...
Clustering large samples of high-dimensional data with fast algorithms is an i...
This work considers optimization methods for large-scale machine learning (ML). Optimization in ML ...
Empirical risk minimization (ERM) problems express optimal classifiers as solutions of optimization ...
In many learning problems, ranging from clustering to ranking through metric learning, empirical est...
This dissertation broadly focuses on developing robust machine learning and optimization approaches ...
We study the problem of clustering a set of data points based on their similarity matrix, each entry...
The cluster analysis of real-life data often encounters the challenges of noisy data or may rely hea...
A new approach to clustering multivariate data, based on a multilevel linear mixed model, is propose...
This research investigates the effectiveness of a non-convex clustering criterion with the ability t...
In many real-world clustering problems, there usually exists little information about the clusters un...
In this paper, we develop a stochastic-gradient learning algorithm for situations involving streamin...
Modern machine learning systems pose several new statistical, scalability, privacy and ethical chall...