This thesis introduces DescribeX, a framework for describing arbitrarily complex XML summaries of web collections, in support of more efficient evaluation of XPath workloads. DescribeX permits the declarative description of document structure using all axes and language constructs in XPath, and generalizes many of the XML indexing and summarization approaches in the literature. DescribeX supports the construction of heterogeneous summaries in which different document elements sharing a common structure can be declaratively defined and refined by means of path regular expressions on axes, or axis path regular expressions (AxPREs). DescribeX can significantly help in the understanding of both the structure of compl...
XML (Extensible Mark-up Language) has recently been understood as a new approach to data modelling. ...
This paper presents the advantages of combining multiple document representation schemes for query p...
This paper presents a methodology to support approximate queries over massive and heterogeneous XML da...
The nature of semistructured data in web collections is evolving. Even when XML web documents are va...
Several web applications (such as processing RSS feeds or web service messages) rely on XPath-based ...
This paper introduces AxPRE summaries, a formalism that allows exploring the (semi-)struc...
We present DescribeX, a tool for exploring and visualizing the structural patterns present in collec...
In the last few years several repositories for storing XML documents and languages for querying XML ...
Summarization: The publish/subscribe paradigm is a popular model for allowing publishers (i.e., data...
We tackle the problem of obtaining statistics on content and structure of XML documents by using sum...
Summarization: We tackle the difficult problem of summarizing the path/branching structure and value...
The Extensible Markup Language (XML) is extremely popular as a generic markup language for text docu...
XML is a rather verbose representation of semistructured data, which may require huge amounts of sto...
Business Intelligence plays an important role in decision making. Based on dat...
We present a new approach for accelerating the execution of XPath expressions using parameterized ma...