Broadly, this dissertation examines the fundamental aspects of building a large-scale information integration system that can answer complex queries over a large number of heterogeneous Internet data sources. Among the many challenges in achieving this goal, we focus on two key issues: efficient query processing and schema matching. Most of the data the integration system processes arrives in a stream from a remote source rather than residing on a local disk, so we need efficient query processing algorithms that work in this environment. Specifically, we investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected perf...
Information and data integration focuses on providing an integrated view of multiple distributed and...
Thesis (Ph.D.)--University of Washington, 2021. As the demand for data-intensive pipelines has grown a...
With the proliferation of the RDF data format, engines for RDF query processing are faced with very ...
The rapid growth of distributed data has fueled significant interest in building data integration sy...
The WWW can be viewed as a collection of heterogeneous information sources available online. However...
In data integration systems, queries posed to a mediator need to be translated into a sequence of qu...
Today, schema matching is a basic task in almost every data-intensive distributed application, namely...
112 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2006. This dissertation presents Ic...
The goal of a data integration system is to allow users to query diverse information sources through...
Information integration over heterogeneous data sources in the Internet environment is a major conce...
An ever-increasing amount of valuable information is stored in web databases, "hidden ...
This dissertation focuses on several important problems in query answering over a single or multi-so...
Data integration systems offer users a uniform interface to a set of data sources. Previous work has...
This dissertation provides an ad hoc integration methodology to manage and integrate heterogeneous o...
We present an unsupervised approach for harvesting the data exposed by a set of structured and part...