Text line segmentation is one of the key steps in historical document understanding. It is challenging due to the variety of fonts, contents, and writing styles, and to the quality of documents that have degraded over the years. In this paper, we address the limitations that currently prevent the building of line segmentation models with a high generalization capacity. We present a study conducted using three state-of-the-art systems (Doc-UFCN, dhSegment and ARU-Net) and show that it is possible to build generic models, trained on a wide variety of historical document datasets, that can correctly segment diverse unseen pages. This paper also highlights the importance of the annotations used during training: each existing dataset is annotated ...
Many libraries, museums, and other organizations contain large collections of handwritten historical...
Historical and artistic handwritten books are valuable cultural heritage (CH) items, as they provide...
“Character recognition” refers to the procedure of ‘reading’ text using a computer, taki...
This paper presents a textline detection method for degraded historical documents. Our meth...
Line segmentation is crucial in handwritten text recognition and analysis tasks. A new text line ext...
The segmentation of individual words is a crucial step in several data mining methods for historical...
Text line extraction is an essential preprocessing step in many handwritten document image analysis ...
Large-scale digitisation of historical documents demands robust methods that cope with the presence ...
Text line segmentation is one of the pre-stages of modern optical character recognition systems. The ...
This paper describes a text-line identification and segmentation technique that is probability based...
Previous deep learning based approaches to text baseline detection in historical documents usually t...
The research presented in this thesis addresses the problem of correction of arbitrary geo...