Information retrieval evaluation most often involves manually assessing the relevance of particular query-document pairs. In cases where this is difficult (such as personalized search), interleaved comparison methods are becoming increasingly common. These methods compare pairs of ranking functions based on user clicks on search results, thus better reflecting true user preferences. However, this dependence on clicks introduces a potential for bias. For example, users have been previously shown to be more likely to click on results with attractive titles and snippets. An interleaving evaluation where one ranker tends to generate results that attract more clicks (without being more relevant) may thus be biased. We present an approach for det...
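To make the comparison mechanism concrete, here is a minimal sketch of team-draft interleaving (Radlinski et al., 2008), one common interleaved comparison method of the kind described above. It is not taken from any of the systems cited here: the function names, parameters, and toy document IDs are illustrative, and it assumes simple per-click credit with a majority-credit decision per impression.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length=10):
    """Team-draft interleaving: merge two rankings into one result list,
    recording which ranker ("team") contributed each shown document."""
    interleaved = []          # merged list shown to the user
    team_of = {}              # doc -> "A" or "B", used later for click credit
    picks = {"A": 0, "B": 0}  # documents contributed by each team so far
    pools = {"A": list(ranking_a), "B": list(ranking_b)}

    while len(interleaved) < length and (pools["A"] or pools["B"]):
        # The team that has contributed fewer documents picks next;
        # ties are broken by a coin flip, as in the original method.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = random.choice(["A", "B"])
        if not pools[team]:   # this ranker is exhausted; let the other pick
            team = "B" if team == "A" else "A"
        doc = pools[team].pop(0)
        if doc in team_of:    # already placed by the other team; skip it
            continue
        interleaved.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return interleaved, team_of

def credit_clicks(team_of, clicked_docs):
    """Credit each click to the team that contributed the clicked document.
    The ranker with more credited clicks wins this impression."""
    credit = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in team_of:
            credit[team_of[doc]] += 1
    return credit

# Hypothetical usage with toy document IDs:
shown, team_of = team_draft_interleave(["d1", "d2", "d3"], ["d3", "d4", "d5"])
print(shown, credit_clicks(team_of, clicked_docs=["d3"]))
```

In a live comparison, credits are aggregated over many impressions before declaring a winner. The caption-bias concern raised above is exactly that this credit can favor a ranker whose titles and snippets merely attract more clicks, without its results being more relevant.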
Interleaved comparison methods, which compare rankers using click data, are a promising alternative ...
A result page of a modern web search engine is often much more complicated than a simple list of "te...
The interactions of users with search engines can be seen as implicit relevance feedback by the user...
Interleaving is an online evaluation method to compare two alternative ranking functions based on th...
Evaluating rankers using implicit feedback, such as clicks on documents in a result list, is an incr...
Ranker evaluation is central to the research into search engines, be it to compare rankers or to pro...
Interleaving is an increasingly popular technique for evaluating information retrieval systems based...
Leveraging clickthrough data has become a popular approach for evaluating and optimizing informatio...
The gold standard for online retrieval evaluation is AB testing. Rooted in the idea of a controlled ...
User clicks, also known as clickthrough data, have been cited as an implicit form of relevance feedbac...
Query logs contain rich feedback information from users interacting with search engines. Therefore, ...
We describe the results of an experiment designed to study user preferences for different orderings ...
Interleaving is an online evaluation method that compares two ranking functions by mixing their res...