We study estimation of mixture models for problems in which multiple views of the instances are available. Examples of this setting include clustering web pages or research papers that have intrinsic (text) and extrinsic (references) attributes. Our optimization criterion quantifies the likelihood and the consensus among models in the individual views; maximizing this consensus minimizes a bound on the risk of assigning an instance to an incorrect mixture component. We derive an algorithm that maximizes this criterion. Empirically, we observe that the resulting clustering method incurs a lower cluster entropy than regular EM for web pages, research papers, and many text collections.
This paper deals with nonparametric estimation of conditional densities in mi...
It is shown that for finite mixtures the missing information tends to zero as the number of observat...
Mixtures of von Mises-Fisher distributions can be used to cluster data on the ...
Abstract. The first part of this abstract focuses on estimation of mixture models for problems in wh...
The first part of this abstract focuses on estimation of mixture models for problems in which multip...
Clustering is often formulated as the maximum likelihood estimation of a mixture model...
Due to a potentially high number of parameters, finite mixture models are often at the risk of overp...
In this dissertation, we propose several methodologies in clustering and mixture modeling when the use...
In this study, we consider unsupervised clustering of categorical vectors that can be of different s...
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introductio...
In Chapter 1 we give a general introduction and motivate the need for clustering and dimension reduc...
This note is completely expository, and contains a whirlwind abridged introduction to the topic of m...
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. ...
Finite mixture models are being increasingly used to model the distributions of a wide variety of ra...
The Expectation–Maximization (EM) algorithm is a popular tool in a wide variety of statistical setti...