Probabilistic modeling for data mining and machine learning problems is a fundamental research area. The general approach is to assume a generative model underlying the observed data, and estimate model parameters via likelihood maximization. It has the deep probability theory as the mathematical background, and enjoys a large amount of methods from statistical learning, sampling theory and Bayesian statistics. In this thesis we study several advanced probabilistic models for data clustering and feature projection, which are the two important unsupervised learning problems. The goal of clustering is to group similar data points together to uncover the data clusters. While numerous methods exist for various clustering tasks, one important q...
International audienceModel-based clustering is a popular tool which is renowned for its probabilist...
The first part of this thesis is concerned with Sparse Clustering, which assumes that a potentially ...
Clustering has been a subject of extensive research in data mining, pattern recognition, and other a...
Probabilistic modeling for data mining and machine learning problems is a fundamental research area....
Latent variable models are used extensively in unsupervised learning within the Bayesian paradigm, t...
34 pages, 11 figuresInternational audienceCount data is becoming more and more ubiquitous in a wide ...
One of the most important goals of unsupervised learning is to discover meaningful clusters in data....
A general inductive probabilistic framework for clustering and classification is introduced using th...
This work addresses the unsupervised classification issue for high-dimensional data by exploiting th...
In a society which produces and consumes an ever increasing amount of information, methods which can...
<p>Clustering methods are designed to separate heterogeneous data into groups of similar objects suc...
Classical model-based partitional clustering algorithms, such as k-means or mixture of Gaussians, pr...
We present a novel algorithm for agglomerative hierarchical clustering based on evaluating marginal ...
As one principal approach to machine learning and cognitive science, the probabilistic framework has...
textClustering is one of the most common data mining tasks, used frequently for data categorization...
International audienceModel-based clustering is a popular tool which is renowned for its probabilist...
The first part of this thesis is concerned with Sparse Clustering, which assumes that a potentially ...
Clustering has been a subject of extensive research in data mining, pattern recognition, and other a...
Probabilistic modeling for data mining and machine learning problems is a fundamental research area....
Latent variable models are used extensively in unsupervised learning within the Bayesian paradigm, t...
34 pages, 11 figuresInternational audienceCount data is becoming more and more ubiquitous in a wide ...
One of the most important goals of unsupervised learning is to discover meaningful clusters in data....
A general inductive probabilistic framework for clustering and classification is introduced using th...
This work addresses the unsupervised classification issue for high-dimensional data by exploiting th...
In a society which produces and consumes an ever increasing amount of information, methods which can...
<p>Clustering methods are designed to separate heterogeneous data into groups of similar objects suc...
Classical model-based partitional clustering algorithms, such as k-means or mixture of Gaussians, pr...
We present a novel algorithm for agglomerative hierarchical clustering based on evaluating marginal ...
As one principal approach to machine learning and cognitive science, the probabilistic framework has...
textClustering is one of the most common data mining tasks, used frequently for data categorization...
International audienceModel-based clustering is a popular tool which is renowned for its probabilist...
The first part of this thesis is concerned with Sparse Clustering, which assumes that a potentially ...
Clustering has been a subject of extensive research in data mining, pattern recognition, and other a...