This paper presents a new approach to automatic document categorization. Exploiting the logical structure of the document, the approach assigns an HTML document to one or more categories (thesis, paper, call for papers, email, ...). From a set of training documents, it generates a set of rules used to categorize new documents. Flexibility comes from associating each rule with a weight that represents its importance in discriminating between the possible categories. This weight is dynamically modified each time a new document is categorized. Experiments with the proposed approach give satisfactory results.
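The weighted-rule idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the rule format, the scoring function, or the weight-update formula, so everything here is an assumption made for illustration: a rule is taken to be "this HTML tag is present, so vote for this category", a category is assigned when its summed rule weights reach a threshold, and the names `Rule`, `TagCollector`, `categorize`, `threshold`, and `step` are hypothetical.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of all start tags seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.tags.add(tag)


@dataclass
class Rule:
    tag: str             # structural feature: presence of this HTML tag (assumed rule form)
    category: str        # category the rule votes for
    weight: float = 1.0  # importance of the rule, adjusted after each categorization


def categorize(html, rules, threshold=1.0, step=0.1):
    """Assign zero or more categories to `html` and adjust the rule weights.

    The additive scoring and the +/- `step` update are illustrative
    assumptions, not the formulas used in the paper.
    """
    collector = TagCollector()
    collector.feed(html)

    # A rule fires when its tag occurs in the document.
    fired = [r for r in rules if r.tag in collector.tags]

    # Sum the weights of the fired rules per category.
    scores = {}
    for r in fired:
        scores[r.category] = scores.get(r.category, 0.0) + r.weight

    # A document may receive several categories.
    assigned = sorted(c for c, s in scores.items() if s >= threshold)

    # Hypothetical dynamic update: reinforce rules that voted for an
    # assigned category, weaken fired rules that did not.
    for r in fired:
        if r.category in assigned:
            r.weight += step
        else:
            r.weight = max(0.0, r.weight - step)

    return assigned


rules = [
    Rule("table", "paper", 0.6),
    Rule("h1", "paper", 0.6),
    Rule("address", "email", 0.5),
]
doc = "<h1>Title</h1><table></table><address>x</address>"
print(categorize(doc, rules))
```

In this toy run both "paper" rules fire and together pass the threshold, so the document is assigned to "paper" and those two rules gain weight, while the single "email" rule fires without reaching the threshold and loses weight.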
Available document collections are more and more required for supervised text categorization tasks. ...
This thesis studies the problem of automatically evolving a hierarchy of categories to organize the ...
Most of the research on text categorization has focused on classifying text documents into a set of ...
In this paper we propose a new approach for flexible document categorization according to the docume...
In this paper, we present a new technique which is the Admixture MCRDR-FCA (AMF) algorithm for Web d...
The automated discovery of logical structure in text documents is an important problem that has rece...
Assistance in retrieving documents on the World Wide Web is provided either by search engines, th...
The availability of large, heterogeneous repositories of electronic documents is increasing rapidly,...
In this paper, the problem of classifying HTML documents into a hierarchy of categories is invest...
Content-related metadata plays an important role in the effort of developing intelligent web applica...
A method for supporting document retrieval by constructing a flexible category structure is proposed...
This paper proposes a new and efficient methodology for clustering HTML documents. The topic wise...
Computer Science Department, College of Computer and Information Sciences, King Saud University Colle...
The data deluge of information in the Web challenges internauts to organize their references to inte...