Crowdsourcing is a popular technique for collecting large amounts of human-generated labels, such as the relevance judgments used to create information retrieval (IR) evaluation collections. Previous research has shown that collecting high-quality labels on crowdsourcing platforms can be challenging. Existing quality assurance techniques focus on answer aggregation or on the use of gold questions, where ground-truth data makes it possible to check the quality of responses. In this paper, we present qualitative and quantitative results revealing how different crowd workers adopt different work strategies to complete relevance judgment tasks efficiently, and the consequent impact on quality. We delve into the techniques and tools that highly experien...
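The two quality assurance techniques mentioned above can be illustrated with a minimal sketch. This is not the procedure used in the paper: the worker IDs, labels, the gold question, and the 0.7 accuracy threshold are all assumptions made for the example. The sketch screens workers by their accuracy on gold questions and then aggregates the remaining labels with a simple majority vote.

```python
from collections import Counter, defaultdict

# Hypothetical worker responses: (worker_id, item_id, label).
# "gold1" is a gold question with a known answer; "doc1" is a regular item.
responses = [
    ("w1", "doc1", "relevant"), ("w2", "doc1", "not_relevant"), ("w3", "doc1", "relevant"),
    ("w1", "gold1", "relevant"), ("w2", "gold1", "not_relevant"), ("w3", "gold1", "relevant"),
]

gold_answers = {"gold1": "relevant"}   # assumed ground truth for the gold question
MIN_GOLD_ACCURACY = 0.7                # assumed acceptance threshold

def gold_accuracy(worker_responses, gold):
    """Fraction of a worker's gold-question answers that match the known ground truth."""
    judged = [(item, label) for item, label in worker_responses if item in gold]
    if not judged:
        return None  # the worker saw no gold questions
    return sum(label == gold[item] for item, label in judged) / len(judged)

# Group responses by worker and keep only workers who pass the gold check.
by_worker = defaultdict(list)
for worker, item, label in responses:
    by_worker[worker].append((item, label))

trusted = {
    worker for worker, resp in by_worker.items()
    if (acc := gold_accuracy(resp, gold_answers)) is None or acc >= MIN_GOLD_ACCURACY
}

# Aggregate the remaining (non-gold) labels from trusted workers by majority vote.
votes = defaultdict(Counter)
for worker, item, label in responses:
    if worker in trusted and item not in gold_answers:
        votes[item][label] += 1

aggregated = {item: counts.most_common(1)[0][0] for item, counts in votes.items()}
print(aggregated)  # {'doc1': 'relevant'} -- w2 fails the gold check, so its vote is excluded
```

In practice, simple majority voting is often replaced by probabilistic aggregation models (e.g., Dawid and Skene's EM-based approach) that weight each worker by an estimated reliability rather than filtering workers outright.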
The ability to entice and engage crowd workers to participate in human intelligence tasks (HITs) is ...
Crowdsourcing has become a standard methodology to collect manually annotated data such as relevance...
Crowdsourcing relevance judgments for the evaluation of search engines is used increasingly to overc...
Information retrieval systems require human-contributed relevance labels for their training and eval...
Crowdsourcing has become an alternative approach to collect relevance judgments at scale thanks to t...
Crowdsourcing platforms offer unprecedented opportunities for creating evaluation benchmarks, but su...
The performance of information retrieval (IR) systems is commonly evaluated using a test set with kn...
The suitability of crowdsourcing to solve a variety of problems has been investigated widely. Yet, t...
While crowd workers typically complete a variety of tasks in crowdsourcing platforms, there is no wi...
Matching crowd workers to suitable tasks is highly desirable as it can enhance task perfor...
How can we best use crowdsourcing to perform a subjective labeling task with low inter-rate...
We consider the problem of acquiring relevance judgements for information retrieval (IR) ...