In this thesis we develop a new approach to exploit crowd assessors' relevance judgments for IR evaluation. We compute evaluation measures based on each assessor's ground truth. These measures are then merged, weighting each assessor according to their expertise level, estimated on a training set as the closeness between the assessor's measures and the gold-standard measures. The results show that the s-AWARE approach outperforms the majority of the tested approaches.
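The following is a minimal sketch of the weighting-and-merging idea described above: per-assessor measure scores are weighted by their closeness to the gold standard on training topics and then combined on test topics. Function names, the inverse-distance closeness function, and the toy data are illustrative assumptions, not the exact s-AWARE formulation.

```python
# Illustrative sketch of a score-based assessor-weighting scheme in the spirit
# of s-AWARE. Names and the closeness function are hypothetical assumptions.

def assessor_weights(assessor_scores, gold_scores, train_topics):
    """Weight each assessor by how close their measure scores are to the
    gold-standard scores on the training topics (closer -> higher weight)."""
    weights = {}
    for assessor, scores in assessor_scores.items():
        # Mean absolute distance between assessor-based and gold-standard scores.
        dist = sum(abs(scores[t] - gold_scores[t]) for t in train_topics) / len(train_topics)
        weights[assessor] = 1.0 / (dist + 1e-9)  # closeness as inverse distance
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}  # normalise to sum to 1


def merged_scores(assessor_scores, weights, test_topics):
    """Merge per-assessor measure scores into a single score per test topic,
    using the expertise weights estimated on the training topics."""
    return {
        t: sum(weights[a] * assessor_scores[a][t] for a in weights)
        for t in test_topics
    }


if __name__ == "__main__":
    # Toy example: AP of one system computed against each assessor's judgments.
    assessor_ap = {
        "assessor_1": {"t1": 0.30, "t2": 0.50, "t3": 0.40},
        "assessor_2": {"t1": 0.10, "t2": 0.20, "t3": 0.15},
    }
    gold_ap = {"t1": 0.28, "t2": 0.52}  # gold standard known on training topics only
    w = assessor_weights(assessor_ap, gold_ap, train_topics=["t1", "t2"])
    print(merged_scores(assessor_ap, w, test_topics=["t3"]))
```

In this toy run, assessor_1 tracks the gold standard closely on the training topics and therefore dominates the merged score on the held-out topic.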
In recent years, gathering relevance judgments through non-topic originators has become an increasin...
While crowdsourcing offers a low-cost, scalable way to collect relevance judgments, lack...
Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where o...
Ground-truth creation is one of the most demanding activities in terms of time, effort, and resource...
Information Retrieval (IR) researchers have often used existing IR evaluation collections and transf...
The agreement between relevance assessors is an important but understudied topic in the Information ...
The leitmotiv throughout this thesis is IR evaluation. We discuss different issues re...
We consider the problem of acquiring relevance judgements for information retrieval (IR) ...
The performance of information retrieval (IR) systems is commonly evaluated using a test set with kn...
In this thesis we investigate two main problems: 1) inferring consensus from disparate inputs to...
We present ir-measures, a new tool that makes it convenient to calculate a diverse set of evaluation...
In Information Retrieval (IR) evaluation, preference judgments are collected by presenting to the as...
As the use of machine learning techniques in IR increases, the need for a sound empirical methodolog...
Relevance judgment of human assessors is inherently subjective and dynamic when evaluation datasets ...