Interleaved comparison methods, which compare rankers using click data, are a promising alternative to traditional information retrieval evaluation methods that require expensive explicit judgments. A major limitation of these methods is that they assume access to live data, meaning that new data must be collected for every pair of rankers compared. We investigate the use of previously collected click data (i.e., historical data) for interleaved comparisons. We start by analyzing to what degree existing interleaved comparison methods can be applied and find that a recent probabilistic method allows such data reuse, even though it is biased when applied to historical data. We then propose an interleaved comparison method that is based on the...
Information retrieval evaluation most often involves manually assessing the relevance of particular ...
A result page of a modern web search engine is often much more complicated than a simple list of "te...
Ranker evaluation is central to the research into search engines, be it to compare rankers or to pro...
Interleaving is an online evaluation method to compare two alternative ranking functions based on th...
Evaluating rankers using implicit feedback, such as clicks on documents in a result list, is an incr...
Online evaluation methods for information retrieval use implicit signals such as clicks from users t...
Evaluation methods for information retrieval systems come in three types: offline evaluation, using ...
Interleaving is an increasingly popular technique for evaluating information retrieval systems based...