Abstract—Real-life databases often contain sets of heterogeneous variables. This paper proposes a methodology for exploring and analyzing such databases, with an application to healthcare data analytics. The methodology is a two-step heterogeneous finite mixture model: the first step uses a joint mixture of Gaussian and multinomial distributions to handle numerical variables (i.e., real and integer numbers) and categorical variables (i.e., discrete values), and the second step uses a mixture of hidden Markov models to handle sequences of categorical values (e.g., series of events). This approach is evaluated on a real-world application, the clustering of administrative health-care d...
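As an illustration of the first step only, the following is a minimal sketch of EM for a joint Gaussian and categorical mixture. It is not the authors' implementation: the function name `fit_mixed_mixture`, the diagonal covariance, the single categorical column, the farthest-point initialisation, the smoothing constants, and the synthetic data are all assumptions made for this example. The second step (the mixture of hidden Markov models) is not covered.

```python
import numpy as np


def fit_mixed_mixture(x_num, x_cat, n_components, n_categories, n_iter=50):
    """EM for a mixture whose components pair a diagonal Gaussian (numerical
    columns) with a categorical distribution (one integer-coded column)."""
    n, _ = x_num.shape
    # Farthest-point initialisation of the means (an illustrative choice).
    centers = [x_num[0]]
    for _ in range(1, n_components):
        dist = np.min([((x_num - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x_num[np.argmax(dist)])
    means = np.array(centers, dtype=float)
    variances = np.tile(x_num.var(axis=0) + 1e-6, (n_components, 1))
    cat_probs = np.full((n_components, n_categories), 1.0 / n_categories)
    weights = np.full(n_components, 1.0 / n_components)
    one_hot = np.eye(n_categories)[x_cat]  # shape (n, n_categories)

    for _ in range(n_iter):
        # E-step: log p(x_num | k) + log p(x_cat | k) + log weight_k,
        # assuming the two blocks are independent given the component.
        diff = x_num[:, None, :] - means[None, :, :]
        log_gauss = -0.5 * (
            np.log(2.0 * np.pi * variances)[None] + diff ** 2 / variances[None]
        ).sum(axis=2)
        log_cat = np.log(cat_probs[:, x_cat]).T
        log_joint = log_gauss + log_cat + np.log(weights)
        log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted updates of all parameters.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ x_num) / nk[:, None]
        diff = x_num[:, None, :] - means[None, :, :]
        variances = (resp[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + 1e-6
        cat_probs = resp.T @ one_hot + 1e-3  # small smoothing avoids log(0)
        cat_probs /= cat_probs.sum(axis=1, keepdims=True)
    return weights, means, variances, cat_probs, resp


# Synthetic data: two well-separated groups that also differ in category usage.
rng = np.random.default_rng(0)
x_num = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(8.0, 1.0, (100, 2))])
x_cat = np.concatenate([
    rng.choice(3, 100, p=[0.8, 0.1, 0.1]),
    rng.choice(3, 100, p=[0.1, 0.1, 0.8]),
])
weights, means, variances, cat_probs, resp = fit_mixed_mixture(x_num, x_cat, 2, 3)
labels = resp.argmax(axis=1)  # hard cluster assignment per sample
```

Treating the numerical and categorical blocks as conditionally independent given the component keeps the E-step a simple sum of log-densities; a richer dependence structure would need a different likelihood.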
The identification of different dynamics in sequential data has become an everyday need in scientif...
Heterogeneity exists in a data set when samples from different classes are merged into the data set....
Within the field of data clustering, methods are commonly referred to as either 'distance-based' or ...
Analysis of medical data and making precise decisions by machine learning is emerging as a hot topic...
Big data in healthcare research is now commonplace. The extraction of useful information on group s...
Cluster analysis seeks to identify homogeneous subgroups of cases in a population. This article prov...
This paper addresses the problem of clustering data when the available data measurements are not mul...
In the present era of “Big Data”, data collection involving a massive number of features with a mix of...
Mixture models are a flexible tool for unsupervised clustering that have found popularity in a vast ...
In this paper a finite mixture model with specific weights for each observation is introduced. The...
In this paper we present the finite mixture models approach to clustering of high dimensional data. ...
Abstract: This paper’s purpose is twofold: first it addresses the adequacy of some theoretical infor...