Together with critical editions and translations, commentaries are one of the main genres of publication in literary and textual scholarship, and have a century-long tradition. Yet the exploitation of thousands of digitized historical commentaries has hitherto been hindered by the poor quality of Optical Character Recognition (OCR), especially on commentaries to Greek texts. In this paper, we evaluate the performance of two pipelines suitable for the OCR of historical classical commentaries. Our results show that Kraken + Ciaconna reaches a substantially lower character error rate (CER) than Tesseract/OCR-D on commentary sections with a high density of polytonic Greek text (average CER 7% vs. 13%), while Tesseract/OCR-D is slightly more accurate...
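For readers unfamiliar with the metric, the character error rate reported above is conventionally the edit distance between the OCR output and the ground truth, divided by the length of the ground truth. The following is a minimal sketch of that conventional definition, not the evaluation code used in the paper; the function names `levenshtein` and `cer` are illustrative, and the NFC normalization step is an assumption added here because polytonic Greek can be encoded either with precomposed or with combining characters.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of an OCR hypothesis against a ground-truth reference."""
    # Assumption: normalize both strings to NFC so that equivalent encodings
    # of the same accented Greek letter are not counted as errors.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, an OCR output that drops the circumflex from one letter of a five-character Greek word has one substitution in five characters, that is, a CER of 20%.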
Slides of the talk by Matteo Romanello, Sven Najem-Meyer and Bruce Robertson entitled Optical Charac...
In this thesis we work on recognizing the text in the book "Rerum Frisicarum Historia" by Ubbo Emm...
A trend to digitize historical paper-based archives has emerged in recent years, with the advent of ...
This paper describes a work-flow designed to populate a digital library o...
While digital libraries based on page images and automatically generated text have made possible ma...
Digitized collections of printed historical texts are important for research in Digital Humanities. ...
GT4HistOCR contains ground truth for research in Optical Character Recognition (OCR) technology appl...
This paper introduces HHD-Ethiopic, a new OCR dataset for historical handwritten Ethiopic script, ch...
As an effort to improve accessibility to historical documents, digitization of historical archives h...
This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Sh...
As our world enters an electronic era, it has become important to be able to quickly and easily pres...
Automatic transcription of historical handwritten documents is a challenging research problem, requi...
The study of texts using a qualitative approach remains the dominant modus operandi in humanities re...
This article presents a procedure for optical character recognition (OCR) improvement, after image p...
This article aims to quantify the impact optical character recognition (OCR) has on the quantitative...