In this paper we present and discuss a novel approach to modeling the logical structure of documents, based on a statistical representation of patterns in a document class. An efficient, error-tolerant recognition heuristic adapted to the model is proposed. Because the approach is statistical, the model can be learned incrementally and the learning is easily automated. The approach has been partially evaluated on a prototype. Finally, the results achieved by the prototype are discussed.

1 Introduction

The need for strongly structured documents has been recognized for a long time, and their importance increases with the development of new software applications. Several approaches and systems have been proposed to address this still-open issue [4, 7, 8, 10...
Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2004. Includes bibliographic...
In this paper, a methodology for document classification and understanding is proposed. It is based ...
We present a general approach for the hierarchical segmentation and labeling of document layout stru...
This paper deals with the representation of document models used in the field of document recognit...
Most of the electronic documents available from today's huge number of electronic information sources...
Abstract. This paper deals with the representation of document models used in the field of document ...
The availability of large, heterogeneous repositories of electronic documents is increasing rapidly,...
We present a fully implemented system based on generic document knowledge for detecting the logical ...
This work introduces a practical method for performing logical layout analysis on heterogeneous peri...
Successful applications of digital libraries require structured access to sources of information....
The use of a generic model for a document class as the knowledge base in a Document Analysis System fa...
An important aspect of document understanding is document logical structure derivation, which involv...
The automated discovery of logical structure in text documents is an important problem that has rece...
Document Analysis and Recognition consist in translating their images into an elect...
This paper presents a new research theme at our institute in the field of document engineering; it d...