Abstract: In this paper, we discuss a text categorization method based on feature selection by k-means clustering. K-means is a classical algorithm for data clustering in text mining, but it is seldom used for feature selection. For text data, the words that express the correct semantics of a class are usually good features. We use k-means to capture several cluster centroids for each class, and then choose the high-frequency words in the centroids as the text features for categorization. The words extracted by k-means not only represent the clusters of each class well, but also have high quality for semantic expression. On three standard text databases, classifiers based on our feature selection method exhibit better performance than the original clas...
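The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the term weighting, the distance measure, the number of centroids per class, or how many words are taken per centroid, so raw term counts, squared Euclidean distance, and the parameters `k` and `top_n` are assumptions here, and the names `tf_vectors`, `kmeans`, and `select_features` are hypothetical.

```python
import random
from collections import Counter


def tf_vectors(docs, vocab):
    """Turn tokenized documents into raw term-count vectors (assumed weighting)."""
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for tokens in docs:
        vec = [0.0] * len(vocab)
        for word, count in Counter(tokens).items():
            vec[index[word]] = float(count)
        vectors.append(vec)
    return vectors


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means with squared Euclidean distance."""
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def select_features(docs, labels, k=2, top_n=3):
    """Cluster each class separately and keep the heaviest words of each centroid."""
    vocab = sorted({w for tokens in docs for w in tokens})
    vectors = tf_vectors(docs, vocab)
    selected = set()
    for cls in sorted(set(labels)):
        class_vecs = [v for v, y in zip(vectors, labels) if y == cls]
        for centroid in kmeans(class_vecs, k):
            # Rank words by centroid weight, breaking ties alphabetically.
            ranked = sorted(range(len(vocab)), key=lambda i: (-centroid[i], vocab[i]))
            selected.update(vocab[i] for i in ranked[:top_n] if centroid[i] > 0)
    return sorted(selected)


# Toy corpus, invented for illustration only.
docs = [
    ["goal", "team", "goal"],
    ["team", "goal", "match"],
    ["bank", "rate", "bank"],
    ["bank", "loan"],
]
labels = ["sports", "sports", "finance", "finance"]

# With k=1 each centroid is simply the class mean vector.
features = select_features(docs, labels, k=1, top_n=1)
print(features)
```

The selected words would then serve as the reduced vocabulary for whatever classifier is trained afterwards. Clustering each class separately, rather than the whole corpus at once, is what ties every chosen word to a specific class.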
Master of Science. Department of Computer Science. William Hsu. This work describes a comparative study of...
Abstract: The decision tree is a flexible and useful classification tool. But on the data with high di...
Feature selection methods have been successfully applied to text categorization but seldom applied t...
Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text cla...
Text clustering has been an overlooked field of text mining that requires more attention. Several ap...
Text classification is the task of automatically sorting a set of documents into categories from a p...
This paper presents a text clustering system developed based on a k-means type subspace clustering a...
Text categorization is the technique used for sorting a set of documents into categories from a pred...
Clustering is one of the most researched areas of data mining applications in the contemporary liter...
Data mining, also known as knowledge discovery in databases (KDD), is the process of discovering interes...
Clustering of text data is one of the tasks of text mining. It divides documents into different cate...