Information Retrieval systems rely on large test collections to measure their effectiveness in retrieving relevant documents. While demand is high, creating such test collections is laborious, both because of the large amounts of data that need to be annotated and because of the intrinsic subjectivity of the task. In this paper we study topical relevance from a user perspective by addressing the problems of subjectivity and ambiguity. We compare our approach and results with the established TREC annotation guidelines and results. The comparison is based on a series of crowdsourcing pilots that experiment with variables such as relevance scale, document granularity, annotation template, and the number of workers. Our results show...
The judging of relevance has been a subject of study in information retrieval for a long time, espec...
The availability of test collections in the Cranfield paradigm has significantly benefited the developme...
Crowdsourcing has become an alternative approach to collect relevance judgments at scale thanks to t...
The influential Text REtrieval Conference (TREC) has always relied upon special...
Evaluation is instrumental in the development and management of effective information retrieval syst...
We consider the problem of acquiring relevance judgements for information retrieval (IR) ...
This paper investigates the agreement of relevance assessments between official TREC judgments and t...
The performance of information retrieval (IR) systems is commonly evaluated using a test set with kn...
Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where o...
Crowdsourcing has become an alternative approach to collect relevance judgments at large scale. In t...
In recent years, gathering relevance judgments through non-topic originators has become an increasin...
Crowdsourcing relevance judgments for the evaluation of search engines is used increasingly to overc...