Text categorization is a typical machine learning task that suffers from incomplete training data. A main reason is the presence of outliers in the training data, such as nonsense documents, documents that are mislabeled or lie on the border between categories, and documents that fall outside the defined categories. An outlier learning technique can therefore be adopted to improve text categorization. This paper proposes an outlier-learning-based text categorization system in which the AdaBoost algorithm is used to identify outliers. Simulation results show that the new system improves learning performance for text categorization.
Computer Science, Artificial I...
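The abstract does not say how AdaBoost is used to identify outliers, so the following is only a minimal sketch of one common heuristic, not the paper's method: AdaBoost keeps raising the weight of samples it misclassifies, so samples that carry an unusually large weight are candidate outliers (for example, mislabeled documents). The function names `best_stump`, `boosting_scores`, and `flag_outliers`, the `factor` cutoff, and the one-dimensional toy feature standing in for document features are all assumptions made for illustration.

```python
import math

def best_stump(X, y, w):
    """Return (weighted_error, predictions) of the best one-feature threshold rule."""
    best_err, best_preds = None, None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for polarity in (1, -1):
                # Predict `polarity` above the threshold, the opposite label otherwise.
                preds = [polarity if x[j] > thr else -polarity for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best_err is None or err < best_err:
                    best_err, best_preds = err, preds
    return best_err, best_preds

def boosting_scores(X, y, rounds=10):
    """Run AdaBoost with stumps; return each sample's mean weight over the rounds."""
    n = len(X)
    w = [1.0 / n] * n
    totals = [0.0] * n
    used = 0
    for _ in range(rounds):
        err, preds = best_stump(X, y, w)
        if err >= 0.5:          # no stump beats chance: stop boosting
            break
        err = max(err, 1e-10)   # avoid log of infinity on a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Misclassified samples (y * pred == -1) gain weight, the rest lose weight.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
        totals = [t + wi for t, wi in zip(totals, w)]
        used += 1
    if used == 0:
        return w
    return [t / used for t in totals]

def flag_outliers(X, y, rounds=10, factor=3.0):
    """Flag samples whose mean weight exceeds `factor` times the uniform weight 1/n."""
    scores = boosting_scores(X, y, rounds)
    cutoff = factor / len(X)
    return [i for i, s in enumerate(scores) if s > cutoff]

# Toy data: labels follow x > 0.5, except index 2, which is deliberately mislabeled.
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9], [1.0]]
y = [-1, -1, 1, -1, -1, 1, 1, 1, 1, 1]
print(flag_outliers(X, y, rounds=1))
```

In a real categorization setting the rows of `X` would be document feature vectors (for instance term weights) and the flagged indices would be reviewed, relabeled, or removed before retraining. Which samples are flagged depends on the number of rounds and on the cutoff, so both would need tuning on held-out data.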
Each type of classifier has its own advantages as well as certain shortcomings. In this ...
This paper examines the use of inductive learning to categorize natural language documents into pred...
Text categorization is the task in which text documents are classified into one or more of predefine...
Webpage categorization has turned out to be an important topic in recent years. In a webpage, text i...
Text categorization task always suffers from a high dimension problem, which leads the learning syst...
Outlier problem is one of the typical problems in an incomplete data based machine learning system [...
The process of text categorization involves some understanding of the content of the doc...
The process of text categorization involves some understanding of the content of the doc...
The goal of this paper is to derive extra representatives from each class to compensate ...
Giving further consideration on linguistic feature, this study proposes an algorithm of Ch...
2nd CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2013, Chongqing, 15-1...
Recently research on text mining has attracted lots of attention from both industrial an...
Considering the explosive growth of data, the increased amount of text data’s effect on the performa...
We present an approach to text categorization using machine learning techniques. The approach is dev...
This paper is a comparative study on representing units in Chinese text categorization. Several kind...