Taxonomies of the Web typically have hundreds of thousands of categories and a skewed distribution of categories over documents. It is not clear whether existing text classification technologies can perform well on, and scale up to, such large-scale applications. To understand this, we evaluated several representative methods (Support Vector Machines, k-Nearest Neighbor and Naive Bayes) on the Yahoo! taxonomies. In particular, we evaluated the effectiveness/efficiency tradeoff of classifiers in a hierarchical setting compared with the conventional (flat) setting, and tested popular threshold tuning strategies for their scalability and accuracy in large-scale classification problems.
Going beyond the traditional text classification, involving a few tens of clas...
Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. ...
Abstract — Large-scale classification taxonomies have thousands of classes, deep hierarchies and ske...
We present an approach to text categorization using machine learning techniques. The approach is dev...
This paper describes automatic document categorization based on large text hierarchy. We handle the...
Abstract- This paper describes automatic document categorization based on large text hierarchy. We h...
Most of the research on text categorization has focused on classifying text documents into a set of ...
In this work we implement and evaluate a methodology to classify multi-labeled web documents into la...
Abstract. In the context of web-scale taxonomies such as Mozilla and Yahoo! directories, previous ...
Text documents in the web are in hierarchy, increase in the content, information grows over the year...
Patent classification is a large scale hierarchical text classification (LSHTC) task. Though compreh...
While multi-class categorization of documents has been of res...
Most of the research on text categorization has focused on mapping text documents to a set of catego...
Abstract. This paper describes a method for the automatic classification of a HTML document into a h...
We study in this paper flat and hierarchical classification strategies in the ...