Effortless digitisation of historical documents has been of great interest for some time. Part of digitisation is annotating the data. Such annotations are obtained through a process called alignment, which links words in an image to the transcript. Annotated data have many uses, such as training handwritten text recognition models. With that application in mind, this project aimed to develop an interactive algorithm for the segmentation and alignment of historical document images. Two methods (referred to as method 1 and method 2) were developed, evaluated, and compared on two data sets, Labour's Memory and IAM. A method to incorporate self-learning was also developed a...
Separating content from noise in historical manuscripts is a fundamental task in digital palaeograph...
This work aims to simplify the tiresome manual comparison of two similar Arabic historical manuscri...
Many challenges and open issues related to the tremendous growth in digitizing...
We describe our work on text-image alignment in the context of building a historical document retrieval ...
Computerized analysis of handwritten documents is an active research area in image analysis and comp...
The growth of digital libraries has yielded a large number of handwritten historical documents in th...
The current state of the art for automatic transcription of historical manuscripts is typically limi...
There is a vast amount of information in old handwritten and typeset texts, and great efforts...
Written texts are both abstract and physical objects: ideas, signs and shapes,...
Indexing and searching collections of handwritten archival documents and manuscripts has always been...
The term "historical documents" encompasses an enormous variety of document types considering differ...
A number of binarization techniques have been proposed in the past for automatic document...
The aim of this thesis is to build and evaluate how a word segmentation algorithm performs when extr...
Historical and artistic handwritten books are valuable cultural heritage (CH) items, as they provide...