In this paper, we address the problem of selectivity estimation in a crowdsourced database. Specifically, we develop several techniques for using workers on a crowdsourcing platform like Amazon’s Mechanical Turk to estimate the fraction of items in a dataset (e.g., a collection of photos) that satisfy some property or predicate (e.g., photos of trees). We do this without explicitly iterating through every item in the dataset. This is important in crowdsourced query optimization to support predicate ordering and in query evaluation, when performing a GROUP BY operation with a COUNT or AVG aggregate. We compare sampling item labels, a traditional approach, to showing workers a collection of items and asking them to estimate how many sati...
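The label-sampling baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: `estimate_selectivity`, `label_fn` (a hypothetical stand-in for asking one crowd worker to label one item), the fixed seed, and the normal-approximation interval are all assumptions made for illustration.

```python
import math
import random


def estimate_selectivity(items, label_fn, sample_size, seed=0, z=1.96):
    """Estimate the fraction of items satisfying a predicate from a random sample.

    label_fn is a hypothetical stand-in for a crowd worker labeling one item.
    Returns (estimate, lower bound, upper bound).
    """
    if not items:
        raise ValueError("items must be non-empty")
    rng = random.Random(seed)
    # Label only a sample instead of iterating through every item.
    sample = rng.sample(items, min(sample_size, len(items)))
    n = len(sample)
    positives = sum(1 for item in sample if label_fn(item))
    p_hat = positives / n
    # Half-width of a normal-approximation (Wald) confidence interval.
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Toy dataset (assumed): 30% of the photos show trees.
photos = ["tree"] * 300 + ["other"] * 700
estimate, low, high = estimate_selectivity(photos, lambda photo: photo == "tree", 100)
```

Here 100 labels stand in for 1,000 items. In a real deployment each label would come from a worker and could be noisy, which this sketch ignores.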
The problem of “approximating the crowd” is that of estimating the crowd’s majority opinion by query...
Many important data management and analytics tasks cannot be completely addressed by automated proce...
Given a set of data items, we consider the problem of filtering them based on a set of properties th...
Crowdsourcing is a way to solve problems that need human contribution. Crowdso...
In this paper, we present CrowdSense, an algorithm for estimating the crowd’s majority opinion by qu...
We propose novel algorithms for the problem of crowdsourcing binary labels. Such binary labeling t...
Crowdsourcing has become an effective and popular tool for human-powered computation to label large ...
Due to the widespread use and importance of crowdsourcing in gathering training data at scale, the d...
Crowdsourcing is the use of human workers, usually through the Internet, for obtaining useful servic...
We study the problem of estimating continuous quantities, such as prices, probabilities, and point ...
Filtering a set of items, based on a set of properties that can be verified by humans, is a common a...