Bias in the retrieval of documents can directly influence the information access of a digital library. In the worst case, systematic favoritism for a certain type of document can render other parts of the collection invisible to users. This potential bias can be evaluated by measuring the retrievability of all documents in a collection. Previous evaluations have been performed on TREC collections using simulated query sets. The question remains, however, how representative this approach is of more realistic settings. To address this question, we investigate the effectiveness of the retrievability measure using a large digitized newspaper corpus, featuring two characteristics that distinguish our experiments from previous studies: (1) com...
A long-standing problem in the domain of Information Retrieval (IR) has been the influence of biases...
A long-standing problem in the domain of Information Retrieval (IR) has been the influence of biases...
A Web archive usually contains multiple versions of documents crawled from the Web at different poin...
Bias in the retrieval of documents can directly influence the information access of a digital librar...
Bias in the retrieval of documents can directly influence the information access of a digital librar...
Bias in the retrieval of documents can directly influence the information access of a digital librar...
Bias in the retrieval of documents can directly influence the information access of a digital librar...
Retrievability is the measure of how easily a document can be retrieved using a particular retrieval...
Retrievability is the measure of how easily a document can be retrieved using a particular retrieval...
Digitized document collections often suffer from OCR errors that may impact a document's readability...
Digitized document collections often suffer from OCR errors that may impact a document's readability...
Digitized document collections often suffer from OCR errors that may impact a document's readability...
Algorithmic bias presents a difficult challenge within Information Retrieval. Long has it been known ...
Retrievability provides an alternative way to assess an Information Retrieval (IR) system by measuri...
Algorithmic bias presents a difficult challenge within Information Retrieval. Long has it been known...