My research is on the theoretical foundations of machine learning. During graduate school, I primarily analyzed standard algorithms under practical assumptions, particularly in areas not covered by existing theory. While this was satisfying, and the understanding it granted will color all my future work, during my postdoctoral studies I have been more interested in the design of new algorithms. In this research statement I will exemplify these two phases with the following two groups of works. Theoretical analyses under practical assumptions. Consider clustering, the task of partitioning data into k groups according to some notion of similarity. Clustering is one of the oldest statistical techniques, and continues to be a tool of choice in modern d...
Cluster analysis is the study of how to partition data into homogeneous subsets so that the partitio...
Clustering is part of data mining, a process used to analyze data...
Clustering is a division of data into groups of similar objects. Representing the data by fewer clus...
We address the problem of communicating domain knowledge from a user to the designer of a clusterin...
The general area of this research is data clustering, in which an unsupervised classification proces...
Researchers have discovered many successful algorithms and methodologies for solving problems at the...
Clustering involves partitioning a given data set into several groups based on some similarity/di...
A paradox for “k-means clustering”: the k-means objective $\phi$ of $C = \{c_i,\ i \in [k]\}$ on a dataset $X$: $\phi_X(C) =$ ...
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, pri...
Recent improvements in machine learning methods have significantly advanced many fields including ...
There are many algorithms to cluster sample data points based on nearness or a similarity measure. ...
To classify objects based on their features and characteristics is one of the most important and pri...
Unsupervised learning is widely recognized as one of the most important challenges facing machine le...
Working with huge amounts of data and learning from it by extracting useful information is one of the...
Abstract: Clustering is the assignment of data objects (records) into groups (called clusters) so th...