This is the authors' version. To access the final version, go to the editor's site through the DOI (http://www.springerlink.com). This paper reports on the INRIA group's approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that can take into account either the structure alone or both the structure and the content. Our approach represents each XML document by a set of its sub-paths, selected according to criteria such as length, whether the path begins at the root, and whether it ends at a leaf. By treating those sub-paths as words, we can use standard methods for vocabulary reduction and simple clustering methods such as K-means that scale well. We actually use an implementation of the clustering algor...
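The sub-path representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the `max_len`, `from_root`, and `to_leaf` parameters are hypothetical stand-ins for the length, root-beginning, and leaf-ending criteria, and only element tags (structure, not content) are considered.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(elem, prefix=()):
    """Yield every root-to-leaf tag path in the tree as a tuple of tag names."""
    path = prefix + (elem.tag,)
    children = list(elem)
    if not children:
        yield path
    for child in children:
        yield from root_to_leaf_paths(child, path)


def sub_paths(xml_text, max_len=3, from_root=False, to_leaf=False):
    """Return the set of sub-path 'words' for one XML document.

    max_len, from_root and to_leaf are assumed knobs, not taken from the paper:
    they cap the sub-path length and optionally require the sub-path to start
    at the root and/or end at a leaf.
    """
    root = ET.fromstring(xml_text)
    words = set()
    for path in root_to_leaf_paths(root):
        n = len(path)
        starts = [0] if from_root else range(n)
        for start in starts:
            for end in range(start + 1, min(n, start + max_len) + 1):
                if to_leaf and end != n:
                    continue
                words.add("/".join(path[start:end]))
    return words


def vectorize(docs, **criteria):
    """Map documents to binary vectors over the shared sub-path vocabulary."""
    bags = [sub_paths(doc, **criteria) for doc in docs]
    vocab = sorted(set().union(*bags))
    vectors = [[1 if word in bag else 0 for word in vocab] for bag in bags]
    return vocab, vectors
```

The resulting vectors could then be passed to any off-the-shelf K-means implementation. Binary presence vectors are used here for simplicity; the abstract does not say which weighting the authors chose.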
This article is a report concerning the two years of the XML Mining track at INEX (2005 and 2006). W...
The processing and management of XML data are popular research issues. However, operations based on ...
(postprint); This version corrects a couple of errors in authors' names in the bibliography./www.juc...
XML documents are becoming ubiquitous because of their rich and flexible format that can be used for...
This article is a report concerning the two years of the XML Mining track at I...
With the standardization of XML as an information exchange language over the net, a huge amount of i...
With the standardization of XML as an information exchange language over the net, a huge am...
With the increasing use of XML in many domains, XML document clustering has been a central research ...
http://www.iaeng.org/publication/IMECS2011/IMECS2011_pp378-381.pdf This work pr...
The XML Document Mining track was launched for exploring two main ideas: (1) identifying key problem...
XML document clustering is essential for many document handling applications such as information sto...
The role of the eXtensible Markup Language (XML) is becoming very important in the research fields f...
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 doc...
We propose a methodology for clustering XML documents on the basis of their structural similarities. T...
XML is becoming increasingly popular as a language for representing many types of electronic documen...