The WWW contains a huge number of documents. Some of them share the same subject but are produced by different people or even different organizations. To guarantee the interchange of such documents, we can use XML, which allows sharing documents that do not have the same structure. However, this makes it difficult to understand the core of such heterogeneous documents (in general, a schema is not available). In this paper, we offer a characterization and an algorithm to obtain the midpoint (in terms of a resemblance function) of a set of semi-structured, heterogeneous documents without optional elements. The trivial case of a midpoint would be the elements common to all documents. Nevertheless, in cases with several heterogeneous documents this may result in an emp...
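The trivial case mentioned in the abstract (the elements common to all documents) can be illustrated with a minimal sketch. This is not the paper's algorithm: the representation of a document as a set of root-to-element tag paths, the use of the Jaccard coefficient as the resemblance function, and the names `element_paths`, `resemblance`, and `common_elements` are all assumptions made here for illustration.

```python
from xml.etree import ElementTree as ET


def element_paths(xml_text):
    """Return the set of root-to-element tag paths of an XML document.

    Assumption: a document is summarised by its tag paths only;
    text content and attributes are ignored.
    """
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def resemblance(a, b):
    """Jaccard coefficient of two path sets (a stand-in resemblance function)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def common_elements(docs):
    """Trivial 'midpoint': the paths that occur in every document."""
    path_sets = [element_paths(d) for d in docs]
    return set.intersection(*path_sets)


# Hypothetical sample documents describing the same subject.
docs = [
    "<book><title/><author/><year/></book>",
    "<book><title/><author/><publisher/></book>",
    "<book><title/><isbn/></book>",
]

core = common_elements(docs)
# Adding a structurally unrelated document leaves nothing in common.
core_with_outlier = common_elements(docs + ["<article><heading/></article>"])
pairwise = resemblance(element_paths(docs[0]), element_paths(docs[1]))
```

In this toy input the three `book` documents share only `/book` and `/book/title`, and the intersection shrinks to nothing once the unrelated `article` document is added, which is why a plain intersection is a weak summary of a heterogeneous collection.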
Due to the heterogeneous nature of XML data for internet applications, exact matching of queries is o...
A prime motivation for using XML to directly represent pieces of information is the ability of suppo...
XML is among the preferred formats for storing the structure of documents such as scientific articles,...
The WWW contains a huge amount of documents. Some of them share the same subject, but are generated ...
In this paper, we study the problem of measuring structural similarities of a large number of source ...
While the world is witnessing an unprecedented information revolution and great speed in the growth ...
The integration of distributed, heterogeneous information sources has been the topic of intense inve...
This chapter discusses existing approaches to evaluate and measure structural similarity in sources ...
In this paper, we deal with the problem of effective search and query answering in heterogeneous web...
While the Internet has facilitated access to information sources, the task of scalable integration ...
The mathematical concept of document resemblance captures well the informal notion of syn...