In this paper, the problem of classifying HTML documents into a hierarchy of categories is investigated in the context of a cooperative information repository named WebClassII. The hierarchy of categories is involved in all aspects of automated document classification, namely feature extraction, learning, and classification of a new document. The innovative aspects of this work are: a) an experimental study on actual Web documents, which can be associated with any node in the hierarchy; b) the feature selection process; c) the automated selection of thresholds for the score returned by a classifier; d) the comparison of three different techniques (flat, hierarchical with proper training sets, hierarchical with hierarchical training sets); e)...
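To make the role of per-node score thresholds concrete, the following is a minimal sketch of top-down classification in which a document may stop at any node of the hierarchy. It is not the WebClassII implementation: the `Category` structure, the `keyword_scorer` helper, the threshold values, and the category names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A document is represented as a bag of weighted features (assumption).
Document = Dict[str, float]
Scorer = Callable[[Document], float]


@dataclass
class Category:
    """Hypothetical hierarchy node: a scorer plus an acceptance threshold."""
    name: str
    scorer: Scorer
    threshold: float
    children: List["Category"] = field(default_factory=list)


def keyword_scorer(weights: Dict[str, float]) -> Scorer:
    """Illustrative linear scorer: weighted sum of matching features."""
    return lambda doc: sum(w * doc.get(term, 0.0) for term, w in weights.items())


def classify(root: Category, doc: Document) -> str:
    """Descend from the root while some child scores at or above its threshold.

    If no child accepts the document, it stays at the current node, so a
    document can be assigned to an internal node as well as to a leaf.
    """
    node = root
    while True:
        scored = [(child.scorer(doc), child) for child in node.children]
        accepted = [(s, c) for s, c in scored if s >= c.threshold]
        if not accepted:
            return node.name
        # Follow the best-scoring accepted child.
        node = max(accepted, key=lambda pair: pair[0])[1]


# Hypothetical three-level hierarchy.
physics = Category("Physics", keyword_scorer({"physics": 1.0, "quantum": 1.0}), 0.8)
science = Category("Science", keyword_scorer({"physics": 1.0, "biology": 1.0}), 0.5, [physics])
sports = Category("Sports", keyword_scorer({"football": 1.0}), 0.5)
root = Category("Root", lambda doc: 1.0, 0.0, [science, sports])
```

In this sketch the thresholds are fixed by hand; the abstract states that the paper selects them automatically, which this example does not attempt to reproduce.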
Abstract. This paper describes automatic document categorization based on a large text hierarchy. We h...
Hierarchical categorization of documents is a task receiving growing interest due to the widespread ...
Abstract. This paper describes an intelligent information system for effectively managing huge amoun...
Abstract. In this paper, the problem of classifying HTML documents into a hierarchy of categories ...
This paper describes a new method for the classification of an HTML document into a hierarchy of cate...
Most of the research on text categorization has focused on classifying text documents into a set of ...
Abstract. This paper describes a method for the automatic classification of an HTML document into a h...
Most work on text categorization has focused on classifying documents into a set of categories ...
This paper describes automatic document categorization based on a large text hierarchy. We handle the...
While automated methods for information organization have been around for several decades now, expon...
In this paper, we present a new technique which is the Admixture MCRDR-FCA (AMF) algorithm for Web d...
With the exponential growth of the World Wide Web, automated subject classification has become a maj...
Automatic classification of web pages is an effective way to deal with the difficulty of retrieving ...
Searching for Web sites is one of the most common tasks performed on the Web. Web page classificatio...
In this work we implement and evaluate a methodology to classify multi-labeled web documents into la...