Entity resolution is central to data integration and data cleaning. Algorithmic approaches have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a more accurate but expensive (and slow) way to bring human insight into the process. Previous work has proposed batching verification tasks for presentation to human workers, but even with batching, a human-only approach is infeasible for data sets of even moderate size, due to the large number of matches to be tested. Instead, we propose a hybrid human-machine approach in which machines are used to do an initial, coarse pass over all the data, and people are used to verify only the most likely matching pairs. We show that for such a hybrid system, generatin...
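As a rough illustration of the hybrid idea described in the abstract above, the following minimal sketch has a machine score every record pair and forwards only the likely matches for human verification. The similarity measure (Jaccard over word tokens), the threshold of 0.3, the sample records, and the function names `jaccard` and `candidate_pairs` are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word-token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(records, threshold=0.3):
    """Machine pass: score all pairs, keep those at or above the threshold.

    The threshold is a hypothetical tuning knob, not a value from the source.
    """
    pairs = []
    for (i, r1), (j, r2) in combinations(enumerate(records), 2):
        score = jaccard(r1, r2)
        if score >= threshold:
            pairs.append((i, j, score))
    # Most likely matches first, so they can be sent to human workers first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical product records, used only to exercise the sketch.
records = [
    "iPad 2 16GB WiFi White",
    "Apple iPad 2 16GB White WiFi",
    "iPhone 4 16GB Black",
]
to_verify = candidate_pairs(records)
```

Only the pairs in `to_verify` would be turned into crowd verification tasks; pairs below the threshold are never shown to a person, which is where the cost saving in a hybrid design of this kind would come from.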
Many important data management and analytics tasks cannot be completely addressed by automated proce...
Crowdsourcing enables programmers to incorporate “human computation” as a building block in algori...
Large-scale distributed computing has made available the resources necessary to solve "AI-hard" prob...
There are several computational tasks for which the help of people is useful. One such task is entit...
We study the problem of enhancing Entity Resolution (ER) with the help of crowdsourcing. ER is the p...
In this paper, we study a hybrid human-machine approach for solving the problem of Entity Resolution...
Crowdsourcing has been established as an essential means to scale human computation in diverse Web a...
Entity resolution (ER) is the task of identifying all records in a database that refer to the same u...
We investigated the use of supervised learning methods that use labels from crowd workers to resolve...
Crowdsourcing is a way to solve problems that need human contribution. Crowdso...
In recent years, crowdsourcing has become essential in a wide range of Web applications. One of the ...
Given a set of data items, we consider the problem of filtering them based on a set of properties th...
Crowdsourcing has revolutionised the way tasks can be completed but the process is frequently ineffi...
Recent approaches to crowdsourcing entity matching (EM) are limited in that they crowdsource only pa...