Humanities scholars increasingly rely on digital archives for their research in place of time-consuming visits to physical archives. This shift in research methodology has the hidden cost of working with digitally processed historical documents: how much trust can a scholar place in noisy representations of source texts? In a series of interviews with historians about their use of digital archives, we found that scholars are aware that optical character recognition (OCR) errors may bias their results. They were, however, unable to quantify this bias or to indicate what information they would need to estimate it. Based on the interviews and a literature study, we provide a classification scheme relating scholarly research tasks to their ...
The study of texts using a qualitative approach remains the dominant modus operandi in humanities re...
Recent advances in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) have l...
Existing research offers fearful conclusions on the use of online archival collections, finding that...
Humanities scholars increasingly rely on digital archives for their research instead of time-consumi...
Cultural heritage institutions increasingly make their collections digitally available. Consequently...
This article aims to quantify the impact optical character recognition (OCR) has on the quantitative...
The millions of pages of historical documents that are digitized in libraries are increasingly used ...
Effects of Optical Character Recognition (OCR) quality on historical information retrieval have so f...
Historical newspapers are increasingly accessed digitally for different purposes both by p...
Book chapter that documents the “Mapping Texts” project, an experiment focused on the problem of OCR...
Digitized document collections often suffer from OCR errors that may impact a document's readability...
Computing and the use of digital sources and resources is an everyday and essential practice in curr...
Iterating with new and improved OCR solutions enforces decision making when it comes to targeting th...
Digitization of historical documents is a challenging task in many digital humanities projects. A po...