The problem addressed is the automatic synthesis of formal correcting rules for LaTeX documents. Each document is represented as a syntax tree. Mappings between the tree nodes of initial documents and those of edited documents form the training set, which is used to generate the rules. Rules with a simple structure, which implement the removal, insertion, or replacement of a single node and use a linear sequence of nodes to select a position, are primarily synthesized. The constructed rules are grouped based on their positions of applicability and their quality. Rules that use a tree-like structure of nodes to select the position are also studied. The changes in the quality of the rules as the training document set is sequentially enlarged are analyzed.
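To make the notion of a simple correcting rule concrete, the following is a minimal sketch, not the method of the paper. It assumes a hypothetical rule format in which a linear sequence of sibling node labels selects the position and a single-node operation (remove, insert, or replace) is applied there. The names `Node`, `Rule`, and `apply_rule`, and the node labels in the usage lines, are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A syntax tree node; `label` stands in for a LaTeX token or command type."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Rule:
    """Hypothetical rule format (an assumption, not taken from the paper).

    `pattern` is a linear sequence of sibling labels that selects a position,
    `index` is the offset inside the pattern where the operation applies,
    `op` is one of "remove", "insert", "replace".
    """
    pattern: List[str]
    index: int
    op: str
    new_label: Optional[str] = None


def apply_rule(node: Node, rule: Rule) -> int:
    """Apply `rule` wherever its pattern matches consecutive siblings.

    Returns the number of times the rule was applied in the subtree.
    """
    count = 0
    n = len(rule.pattern)
    i = 0
    while i + n <= len(node.children):
        window = [child.label for child in node.children[i:i + n]]
        if window != rule.pattern:
            i += 1
            continue
        pos = i + rule.index
        if rule.op == "remove":
            del node.children[pos]
            delta = -1
        elif rule.op == "insert":
            node.children.insert(pos, Node(rule.new_label))
            delta = 1
        elif rule.op == "replace":
            # Assumption: replacement changes the label and keeps the subtree.
            node.children[pos].label = rule.new_label
            delta = 0
        else:
            raise ValueError(f"unknown operation: {rule.op}")
        count += 1
        # Skip past the matched window, adjusted for the change in length.
        i += n + delta
    for child in node.children:
        count += apply_rule(child, rule)
    return count


# Usage: replace an ordinary space before a citation with a non-breaking one.
paragraph = Node("par", [Node("text"), Node("space"), Node("cite")])
tie_rule = Rule(pattern=["text", "space", "cite"], index=1,
                op="replace", new_label="nbsp")
applied = apply_rule(paragraph, tie_rule)
```

A rule that selects its position by a tree-like structure of nodes would need a pattern that also constrains parents and children, rather than siblings only as in this sketch.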
In the last years, the spread of computers and the Internet caused a significant amount of documents...
Introduction. Electronic document management systems (EDMS) are used to store, process and transmit ...
In this paper we present a novel system for automatically marking up text documents into XML. The sy...
A paper document processing system is an information system component which transforms information o...
The current spread of digital documents raised the need of effective content-based retrieval techni...
In this paper, a machine learning approach to support the user during the correction of the layout a...
This article investigates the possibility of logical structure (abstract syntax tree) automatic cons...
The present work proposes a method for the automatic extraction of textual elements within documents...
WISDOM++ is an intelligent document processing system that transforms a paper document into HTML/XML...
This dissertation describes a knowledge-based system for classifying documents based upon the layout...
Layout analysis is the process of extracting a hierarchical structure describing the layout of a pag...
This paper describes a new approach to automatically learning linguistic knowledge for spelling corr...
Knowledge-based approaches to document categorization make use of well-elaborated and powerful pat...
Document classification is one of the fundamental and most important problems in the analysis of text documents. Na...
WISDOM is an intelligent document processing system that transforms printed information into a symbol...