We consider the problem of clustering data points in high dimensions, i.e., when the number of data points may be much smaller than the number of dimensions. Specifically, we consider a Gaussian mixture model (GMM) with non-spherical Gaussian components, where the clusters are distinguished by only a few relevant dimensions. The method we propose is a combination of a recent approach for learning parameters of a Gaussian mixture model and sparse linear discriminant analysis (LDA). In addition to cluster assignments, the method returns an estimate of the set of features relevant for clustering. Our results indicate that the sample complexity of clustering depends on the sparsity of the relevant feature set, while only scaling logarithmically...
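The two outputs described in this abstract (cluster assignments plus an estimated set of relevant features) can be illustrated with a deliberately simplified sketch. It is not the method of the paper: it assumes two unit-variance spherical clusters, substitutes plain 2-means for the mixture-parameter learning step, and substitutes hard-thresholding of the difference between cluster means for sparse LDA. All function names, the threshold, and the synthetic data settings are hypothetical choices made for illustration.

```python
import random


def make_data(n_per_cluster=20, dim=50, relevant=(0, 1, 2), shift=5.0, seed=0):
    """Two unit-variance Gaussian clusters that differ only on `relevant` dims."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, sign in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_cluster):
            x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for j in relevant:
                x[j] += sign * shift
            points.append(x)
            labels.append(label)
    return points, labels


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def mean(rows):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def two_means(points, iters=20):
    """Plain 2-means (assumed stand-in for learning the mixture parameters)."""
    # Initialise with the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    assign = []
    for _ in range(iters):
        new_assign = [0 if sq_dist(p, c0) <= sq_dist(p, c1) else 1 for p in points]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        g0 = [p for p, a in zip(points, assign) if a == 0]
        g1 = [p for p, a in zip(points, assign) if a == 1]
        if g0:
            c0 = mean(g0)
        if g1:
            c1 = mean(g1)
    return assign, c0, c1


def select_features(c0, c1, threshold):
    """Keep dimensions whose cluster means differ by more than `threshold`
    (assumed stand-in for the sparse discriminant step)."""
    return [j for j, (u, v) in enumerate(zip(c0, c1)) if abs(u - v) > threshold]


points, truth = make_data()
assign, c0, c1 = two_means(points)
selected = select_features(c0, c1, threshold=3.0)
print("estimated relevant features:", selected)
```

In this toy setting there are 40 points in 50 dimensions and only three dimensions separate the clusters, so the sketch mirrors the regime the abstract describes (fewer points than dimensions, sparse relevant set) without reproducing its non-spherical model or its guarantees.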
Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that ...
A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data cont...
Clustering in high-dimensional spaces is a recurrent problem in many domains, ...
The first part of this thesis is concerned with Sparse Clustering, which assumes that a potentially ...
While several papers have investigated computationally and statistically efficient methods for le...
We propose a new Gaussian clustering method named EM-FDA for feature extraction in high dimensional ...
Variable selection is an important problem for cluster analysis of high-dimensional data. It is also...
Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for...
We introduce a method for dimension reduction with clustering, classification, or discriminant analy...
In recent years we have witnessed increased attention towards methods for clustering matrix-v...
Sparse clustering, which aims to find a proper partition of an extremely high-dimensional data set w...
Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific d...