Abstract — Collected data often contains uncertainties. Prob-abilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be per-formed. Until now, however, data integration approaches have focused on the integration of certain source data (relational or XML). There is no work on the integration of uncertain (esp. probabilistic) source data so far. In this paper, we present a first step towards a concise consolidation of probabilistic data. We focus on duplicate detection as a representative and essential step in an integration process. We present techniques for identifying multiple probabilistic representations of the same real...
The problem of identifying approximately duplicate records between databases is known, among others,...
In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for proba...
Sources of data uncertainty and imprecision are numerous. A way to handle this uncertainty is to ass...
Collected data often contains uncertainties. Probabilistic databases have been proposed to manage un...
Abstract A major source of uncertainty in databases is the presence of duplicate items, i.e., record...
1 Introduction Finding duplicate records in one, or linking records from several data sets areincrea...
Probabilistic data integration is a specific kind of data integration where integration problems suc...
Databases constructed automatically through web mining and information extraction often overlap with...
In current research, duplicate detection is usually considered as a deterministic approach in which ...
hamburg.de In current research, duplicate detection is usually considered as a deterministic approac...
Real world applications that deal with information extraction, such as business intelligence softwar...
International audienceAlthough there is a long line of work on identifying duplicates in relational ...
In previous work, we have introduced probabilistic description logic programs for the Semantic Web, ...
At present, many information sources are available wherever you are. Most of the time, the informati...
Sources of data uncertainty and imprecision are numerous. A way to handle this uncertainty is to ass...
The problem of identifying approximately duplicate records between databases is known, among others,...
In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for proba...
Sources of data uncertainty and imprecision are numerous. A way to handle this uncertainty is to ass...
Collected data often contains uncertainties. Probabilistic databases have been proposed to manage un...
Abstract A major source of uncertainty in databases is the presence of duplicate items, i.e., record...
1 Introduction Finding duplicate records in one, or linking records from several data sets areincrea...
Probabilistic data integration is a specific kind of data integration where integration problems suc...
Databases constructed automatically through web mining and information extraction often overlap with...
In current research, duplicate detection is usually considered as a deterministic approach in which ...
hamburg.de In current research, duplicate detection is usually considered as a deterministic approac...
Real world applications that deal with information extraction, such as business intelligence softwar...
International audienceAlthough there is a long line of work on identifying duplicates in relational ...
In previous work, we have introduced probabilistic description logic programs for the Semantic Web, ...
At present, many information sources are available wherever you are. Most of the time, the informati...
Sources of data uncertainty and imprecision are numerous. A way to handle this uncertainty is to ass...
The problem of identifying approximately duplicate records between databases is known, among others,...
In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for proba...
Sources of data uncertainty and imprecision are numerous. A way to handle this uncertainty is to ass...