We present DescribeX, a tool for exploring and visualizing the structural patterns present in collections of XML documents. Developers can use DescribeX to interactively discover the XPath expressions that will actually return elements present in a collection of XML files. The element structure of many XML document collections on the Web can be fairly unpredictable. This is the case even when the documents are validated by a schema, and it can happen for two main reasons. First, the documents may follow a schema that allows many elements to occur almost anywhere in the document (e.g., by extensive use of <xsd:choice> in XML Schema). Second, the default namespace and corresponding schema can be extended by incorpo...
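To make the idea of discovering which paths actually occur in a collection concrete, here is a minimal sketch in Python. It is not DescribeX's method; it simply counts, for each root-to-element label path, how many documents contain it. The function names `label_paths` and `summarize` and the two sample documents are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def label_paths(xml_text):
    """Yield the root-to-element label path of every element in one document."""
    root = ET.fromstring(xml_text)
    stack = [(root, "/" + root.tag)]
    while stack:
        elem, path = stack.pop()
        yield path
        for child in elem:
            stack.append((child, path + "/" + child.tag))


def summarize(docs):
    """Count, for each label path, the number of documents that contain it."""
    counts = Counter()
    for doc in docs:
        # A set ensures each path is counted once per document.
        counts.update(set(label_paths(doc)))
    return counts


# Hypothetical two-document collection with slightly different structure.
docs = [
    "<feed><entry><title>a</title></entry></feed>",
    "<feed><entry><title>b</title><author>x</author></entry></feed>",
]
summary = summarize(docs)
for path, n in sorted(summary.items()):
    print(path, n)
```

A path with a nonzero count corresponds to a simple child-axis XPath expression that returns at least one element somewhere in the collection. Note that `xml.etree.ElementTree` reports namespaced tags in the form `{uri}local`, so documents that mix namespaces would yield paths in that notation.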
The distributed nature of the Web, as a decentralized system exchanging inform...
The common abstraction of XML Schema by unranked regular tree languages is not entirely accurate. To...
XML datasets of various sizes and properties are needed to evaluate the correctness and efficiency o...
The nature of semistructured data in web collections is evolving. Even when XML web documents are va...
Evaluating collections of XML documents without paying attention to the schema they were written in ...
XML Schemas provide a generalization of Document Type Definitions for describing the valid...
This paper introduces AxPRE summaries, a formalism that allows exploring the (semi-)struc...
Several web applications (such as processing RSS feeds or web service messages) rely on XPath-based ...
The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describi...
Although the presence of a schema enables many optimizations for operations on XML documents, recent...
XML Schema is one of the most used specifications for defining types of XML documents. It provides...
Object-Oriented (OO) conceptual modelling offers power in describing and modelling real-world dat...
XML is an increasingly important format for storing and exchanging data. In the face of this ten...
XML is among the preferred formats for storing the structure of documents such as scientific articles,...
The eXtensible Markup Language (XML) is fast emerging as the dominant standard for describing and in...