We introduce a novel two-stage automatic XML mark-up system that combines the WEBSOM approach to document categorisation with the C5 inductive learning algorithm. The WEBSOM method clusters the XML marked-up documents so that semantically similar documents lie close together on a Self-Organising Map (SOM). The C5 algorithm automatically learns and applies mark-up rules derived from the nearest SOM neighbours of an unmarked document. The system learns from mark-up errors to improve accuracy. The automatically marked-up documents produced by the system are also categorised on the SOM, further refining the SOM's document coverage.
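The first stage described above (placing documents on a SOM and retrieving the marked-up neighbours of an unmarked document) can be illustrated with a minimal sketch. This is not the authors' implementation: the plain term-frequency encoding, the tiny grid, the decay schedules and every function name (bag_of_words, train_som, nearest_marked_up) are assumptions made for illustration, and WEBSOM itself uses a more elaborate document encoding. The second stage, rule induction with C5, is not shown.

```python
import math
import random


def bag_of_words(text, vocab):
    """Term-frequency vector over a fixed vocabulary, L2-normalised."""
    words = text.lower().split()
    vec = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid division by zero
    return [v / norm for v in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(som, vec):
    """Grid coordinates of the unit whose weight vector is closest to vec."""
    return min(som, key=lambda pos: sq_dist(som[pos], vec))


def train_som(vectors, rows=3, cols=3, epochs=50, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    som = {(r, c): [rng.random() for _ in range(dim)]
           for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac                              # learning rate decays linearly
        radius = 1.0 + max(rows, cols) / 2 * frac    # neighbourhood shrinks over time
        for vec in vectors:
            br, bc = best_matching_unit(som, vec)
            for (r, c), w in som.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])
    return som


def nearest_marked_up(som, marked_vectors, query_vec, k=2):
    """Indices of the k marked-up documents whose map units lie closest
    (in grid distance) to the unit of the unmarked query document."""
    qr, qc = best_matching_unit(som, query_vec)

    def grid_dist(idx):
        r, c = best_matching_unit(som, marked_vectors[idx])
        return (r - qr) ** 2 + (c - qc) ** 2

    return sorted(range(len(marked_vectors)), key=grid_dist)[:k]
```

In a pipeline of the kind the abstract describes, the documents returned by nearest_marked_up would supply the training examples from which mark-up rules for the unmarked document are induced.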
Extensible Markup Language (XML) is a simple and flexible text format derived from Standard Generali...
Editing an XML document manually is a complicated task. While many XML editors...
XML is becoming increasingly popular as a language for representing many types of electronic documen...
We present a novel system for automatically marking up text documents into XML and discuss the benef...
In this paper we present a novel system for automatically marking up text documents into XML. The sy...
In this paper we present a novel system that can automatically mark up text documents into XML. The ...
Abstract: In this paper we present a system which automatically converts text documents into XML by ...
In this paper we present a novel system which automatically converts text documents into XML by extr...
The number of XML documents produced and available on the Internet is steadily increasing. It is th...
This thesis describes a new optimisation and new heuristics for automatically marking up XML documen...
Self-Organizing Maps capable of encoding structured information will be used for the clustering of X...
XML has become the universal data format for a wide variety of information systems. The large number...