(Postprint. This version corrects a couple of errors in authors' names in the bibliography. www.jucs.org) This paper presents some experiments in clustering homogeneous XML documents to validate an existing classification or, more generally, an organisational structure. Our approach integrates techniques for extracting knowledge from documents with unsupervised classification (clustering) of documents. We focus on the feature selection used for representing documents and its impact on the emerging classification. We mix the selection of structured features with fine textual selection based on syntactic characteristics. We illustrate and evaluate this approach with a collection of Inria activity reports for the year 2003. The objective is to clust...
This paper presents the incremental clustering algorithm, XML documents Clustering with Level Simila...
Abstract. Every day more digital data in semi-structured format are available on the World Wide Web,...
With the standardization of XML as an information exchange language over the net, a huge amount of i...
This article is a report concerning the two years of the XML Mining track at I...
In the last few years we have observed a proliferation of approaches for clustering XML documents ...
The XML Document Mining track was launched for exploring two main ideas: (1) identifying key problem...
In the last few years we have observed a proliferation of approaches for clustering XML documents an...
This article is a report concerning the two years of the XML Mining track at INEX (2005 and 2006). W...
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 doc...
This is the authors' version. To access the final version go to the editor's site through the DOI./h...
This paper reports our experiments carried out for the INEX XML Mining track, ...
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queen...
XML is becoming increasingly popular as a language for representing many types of electronic documen...
XML documents are becoming ubiquitous because of their rich and flexible format that can be used for...
With the increasing use of XML in many domains, XML document clustering has been a central research ...