Data clustering is a well-known task in data mining, and it often relies on distances or, in some cases, similarity measures. The latter is the case for real-world datasets that comprise categorical attributes. Several similarity measures have been proposed in the literature; however, the right choice depends on the context and the dataset at hand. In this paper, we address the following question: given a set of measures, which one is best suited for clustering a particular dataset? We propose an approach to automate this choice, and we present an empirical study on categorical datasets in which we evaluate the proposed approach.
Clustering categorical data is a major challenge in data mining. Direct comparison of categorical ...
Grouping objects that are described by attributes, or clustering, is a central notion in data mining....
In many domains, we face heterogeneous data with both numeric and categorical ...
Abstract. The concept of similarity is fundamentally important in almost every scientific field. Cl...
This paper introduces a measure of similarity between two clusterings of the same dataset produced b...
In clustering, one may be interested in the classification of similar objects into groups, and one m...
Data mining processes such as clustering, classification, regression and outlier detection are devel...
Clustering is the unsupervised classification of patterns (observations, data items, or feature vect...
Clustering is an unsupervised learning technique which aims at grouping a set of objects into cluste...
The problem of clustering has been widely studied in the context of data mining, where by grouping o...