A trend to digitize historical paper-based archives has emerged in recent years with the advent of digital optical scanners. Many paper-based books, textbooks, magazines, articles, and documents are being transformed into electronic versions that can be manipulated by a computer. For this purpose, Optical Character Recognition (OCR) systems have been developed to transform scanned digital text into editable computer text. However, OCR output contains different kinds of errors, and automatic error-correction tools can help improve the quality of electronic texts by cleaning them and removing noise. In this paper, we perform a qualitative and quantitative comparison of several error-correction techniques for...
Optical character recognition (OCR) is crucial for a deeper access to historical collections. OCR ne...
The work reported in this paper aims at performance optimization in the digiti...
This paper tackles the task of named entity recognition (NER) applied to digitized historical texts ...
Optical character recognition (OCR) for historical documents is a complex procedure subject to a uni...
For indexing the content of digitized historical texts, optical character recognition (OCR) errors a...
Machine Translation (MT) plays a critical role in expanding capacity in the translation industry. H...
Born-analog documents contain enormous knowledge which is valuable to our society. For the purpose o...
We present an approach for automatic detection and correction of OCR-induced misspellings in histori...
In this paper we describe our efforts in reducing and correcting OCR errors in the context of buildi...