In this paper, we study the notion of entropy for a set of attributes of a table and propose a novel method to measure the dissimilarity of categorical data. Experiments show that our estimation method improves the accuracy of the popular unsupervised Self-Organizing Map (SOM) compared with the Euclidean or Mahalanobis distance. The distance comparison is applied to the clustering of multidimensional contingency tables. Two factors make our distance function attractive: first, the general framework can be extended to other classes of problems; second, the measure may be normalized to obtain a coefficient similar to, for instance, Pearson's coefficient of contingency.
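As a rough illustration of the two standard quantities the abstract refers to, the sketch below computes the Shannon entropy of a set of attributes of a table and Pearson's coefficient of contingency for a pair of attributes. It is not the dissimilarity measure proposed in the paper, which the abstract does not specify; the function names, the dictionary-per-row table layout, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def joint_entropy(rows, attrs):
    """Shannon entropy (in bits) of the joint distribution of `attrs`.

    `rows` is a list of dicts (one per table row); this layout is an
    assumption of the sketch, not something the paper prescribes.
    """
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def contingency_coefficient(rows, a, b):
    """Pearson's coefficient of contingency, C = sqrt(chi2 / (chi2 + n))."""
    n = len(rows)
    joint = Counter((row[a], row[b]) for row in rows)
    count_a = Counter(row[a] for row in rows)
    count_b = Counter(row[b] for row in rows)
    chi2 = 0.0
    for x, na in count_a.items():
        for y, nb in count_b.items():
            expected = na * nb / n          # count expected under independence
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical toy table with two perfectly associated attributes.
table = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "blue", "shape": "square"},
    {"color": "blue", "shape": "square"},
]

h_color = joint_entropy(table, ["color"])
h_both = joint_entropy(table, ["color", "shape"])
c_coef = contingency_coefficient(table, "color", "shape")
```

In this toy table the second attribute adds no information beyond the first, so the joint entropy of both attributes equals the entropy of either one alone, and the contingency coefficient reaches its maximum for a 2x2 table.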
The extension of sample entropy methodologies to multivariate signals has received considerable atte...
This paper deals with the measurement of entropy when an indistinguishability relation on the set of...
The cosine or correlation measures of similarity used to cluster high dimensional data are interpret...
Conventional clustering algorithms are restricted for use with data containing ratio or interval sca...
A successful attempt in exploring a dissimilarity measure which captures the reality is made in this...
In this paper we propose a new index Z for measuring the dissimilarity between two hierarchical clust...
In this paper we propose a new index Z for measuring the dissimilarity between two hierarchical clus...
We propose a novel hierarchical clustering for distribution valued dissimilarities. Analysis of larg...
Clustering is the process of organizing a dataset into isolated groups such that data points i...
ISBN: 978-1-59904-849-9; 11 pages. Adaptation of the Self-Organizing Map to dissimilarity data is of...
The development of analysis methods for categorical data began in the 1990s, and it has been boomi...
Many mixed datasets with both numerical and categorical attributes have been collected in various fi...
We present a new algorithm capable of partitioning sets of objects by taking s...
We show a new metric for comparing unordered, tree-structured data. While such data is increasingly ...
This paper proposes a new measure for similarity between basket datasets. The new measure is calcula...