[[abstract]]Text categorization requires some understanding of the content of the documents, some prior knowledge of the categories, or both. For document content, our Chinese text categorization system uses a filtering measure for feature selection, and we modify the Term Frequency-Inverse Document Frequency (TF-IDF) formula to strengthen the weights of important keywords and weaken the weights of unimportant ones. For category knowledge, we use category priority to represent the relationship between two different categories. The experimental results show that our method effectively reduces noisy text and also increases the accuracy and recall of text categorization.[...
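For orientation, the baseline that the abstract modifies can be sketched as follows. This is the textbook TF-IDF weighting, tf(t, d) * log(N / df(t)), and not the authors' modified formula, which the abstract does not state. The function name `tf_idf` and the sample documents are hypothetical, and the documents are assumed to be already segmented into words.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Textbook weighting: tf(t, d) * log(N / df(t)).
    Assumption: this is the unmodified baseline, not the modified
    formula mentioned in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Relative term frequency scaled by inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-segmented documents; raw Chinese text would need
# a word segmenter before this step.
docs = [
    ["股票", "市场", "上涨"],
    ["股票", "基金", "投资"],
    ["足球", "比赛", "市场"],
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in every document receives weight zero, and a term unique to one document receives the largest weight for its frequency, which is the behavior a modified scheme would then sharpen or dampen.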
Words and n-grams are commonly used Chinese text representing units and are proved to be good featur...
Abstract- With entering into information society and the Internet developing rapidly, people could a...
Abstract—Text representation, which is a fundamental and necessary process for text-based intelligen...
Automatic text categorization is the task of assigning natural language text documents to predefined...
[[abstract]]In this paper, we propose and evaluate approaches to categorizing Chinese texts, which c...
Automatic text categorization is the task of assigning natural language text documents to predefined...
[[abstract]]Recently research on text mining has attracted lots of attention from both industrial an...
Abstract: Giving further consideration on linguistic feature, this study proposes an algorithm of Ch...
Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-bas...
This paper is a comparative study on representing units in Chinese text categorization. Several kind...
Text categorization task always suffers from a high dimension problem, which leads the learning syst...
Within text categorization and other data mining tasks, the use of suitable methods for term weighti...
Automatic feature selection methods such as document frequency (DF), information gain (IG), mutual i...
Words and n-grams are commonly used Chinese text representing units and are proved to be good featur...