This thesis focuses on models for non-parametric clustering of multivariate count data. While there has been significant work on Bayesian non-parametric modelling in the last decade, in the context of mixture models for real-valued data and some forms of discrete data such as multinomial mixtures, there has been much less work on non-parametric clustering of multivariate count data. The main challenges in clustering multivariate counts include choosing a suitable multivariate distribution that adequately captures the properties of the data, for instance handling over-dispersed or sparse multivariate data, while at the same time leveraging the inherent dependency structure between dimensions and across instances to get meaningful cluste...
34 pages, 11 figures. International audience. Count data is becoming more and more ubiquitous in a wide ...
© 2015 IEEE. We present a novel non-parametric clustering model using Gaussian mixture model (NHCM)....
Finite mixtures are applied to perform model-based clustering of multivariate data. Existing models ...
Latent variable models are used extensively in unsupervised learning within the Bayesian paradigm, t...
A useful step in data analysis is clustering, in which observations are grouped together in a hopefu...
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional ...
The density-based formulation aims at recasting the clustering problem to a mathematically sound fra...
In the Bayesian nonparametric family, Dirichlet Process (DP) is a prior distribution that is able to...
31 pages, 10 figures. Count data is becoming more and more ubiquitous in a wide range of applications,...
Time series data may exhibit clustering over time and, in a multiple time series context, the clust...
The first part of this thesis is concerned with Sparse Clustering, which assumes that a potentially ...
The Dirichlet process mixture (DPM) model, a typical Bayesian nonparametric model, can infer the num...
Bayesian nonparametric mixture models are often employed for modelling complex data. While these mod...
Factor analysis models effectively summarise the covariance structure of high dimensional data, but ...
Copulas enable flexible parameterization of multivariate distributions in terms of constituent margi...