In data integration we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE) where the source is natural language. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task), or manually extend schemas. We propose instead to store data in a probabilistic database of universal schema. This schema is simply the union of all source schemas, and the probabilistic database learns how to predict the cells of each source r...
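The universal-schema idea described above can be viewed as a matrix-completion problem: rows are entity pairs, columns are the union of relations from every source, and the model predicts unobserved cells. The snippet below is a minimal toy sketch of that view using plain logistic matrix factorization; the data, relation names, and hyperparameters are all hypothetical, and this is not the authors' actual model.

```python
import numpy as np

# Toy "universal schema": the column set is the UNION of relations from two
# hypothetical sources (an IE surface pattern and a structured-KB column).
relations = ["ie:works-for", "kb:employer"]
pairs = [("alice", "acme"), ("bob", "globex"), ("carol", "acme")]

# Observed cells (1 = fact seen in that source). The mask M marks which
# cells were observed at all; (alice/acme, kb:employer) is held out.
X = np.array([[1., 0.],   # alice/acme: seen by the IE pattern; KB cell unobserved
              [1., 1.],   # bob/globex: seen in both sources
              [0., 1.]])  # carol/acme: absent from IE, present in the KB
M = np.array([[1., 0.],
              [1., 1.],
              [1., 1.]])  # 0 marks the unobserved cell we want to predict

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Low-rank factorization X ~ sigmoid(P @ R.T): each pair and each relation
# gets a latent vector, so one source's relation can predict another's.
rng = np.random.default_rng(0)
k = 2
P = 0.1 * rng.standard_normal((len(pairs), k))
R = 0.1 * rng.standard_normal((len(relations), k))

lr = 0.5
for _ in range(2000):
    G = M * (sigmoid(P @ R.T) - X)                 # masked logistic-loss gradient
    P, R = P - lr * (G @ R), R - lr * (G.T @ P)    # RHS uses the old P and R

scores = sigmoid(P @ R.T)
# Probability the model assigns to the held-out cell
# (alice/acme, kb:employer), i.e. predicting one source's relation
# from evidence expressed in another source's schema.
print(float(scores[0, 1]))
```

The masked loss is the key design choice here: unobserved cells contribute no gradient, so the model is free to score them, rather than being forced to treat everything unseen as false.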
Databases constructed automatically through web mining and information extraction often overlap with...
We propose the Schema-Model Framework, which characterizes algorithms that learn probabilistic model...
Schema matching is the problem of finding relationships among concepts across data sources that are ...
We propose a probabilistic approach to the problem of schema mapping. Our approach is declarati...
Over the recent past, information extraction (IE) systems such as NELL and ReVerb have attained much...
We propose a new declarative approach to schema mapping discovery, that is, the task of identifying ...
The goal of a mediator system is to provide users a uniform interface to the multitude of informatio...
One of the core problems in soft computing is dealing with uncertainty in data. In this paper, we re...
This paper reports our first set of results on managing uncertainty in data integration. We posit th...
Probabilistic description logic programs are a powerful tool for knowledge representation in the Sem...
We propose a new kind of probabilistic programming language for machine learning. We write programs ...
Probabilistic description logic programs are a powerful tool for knowledge representation in the Sem...
Schema matching is the problem of finding correspondences (mapping rules, e.g. logical formulae) bet...