Abstract. Text categorization is the task of assigning predefined categories to natural language text. With the widely used ‘bag of words’ representation, previous research usually assigns a word a value indicating either whether the word appears in the document concerned or how frequently it appears. Although these values are useful for text categorization, they do not fully express the abundant information contained in the document. This paper explores the effect of other types of values, which express the distribution of a word in the document. These novel values assigned to a word are called distributional features, which include the compactness of the appearances of the word and the position of the first appearance of the w...
Automatic feature selection methods such as document frequency (DF), information gain (IG), mutual i...
A number of content management tasks, including term categorization, term clustering, and automated ...
Text genre classification is the process of identifying functional characteristics of text documents...
Abstract. In previous research of text categorization, a word is usually described by features which...
Text Categorization is traditionally done by using the term frequency and inverse document frequency...
Predefined categories can be assigned to natural language text using text classification. It...
In the field of Natural Language Processing, supervised machine learning is commonly used to solve c...
We study an approach to text categorization that combines distributional clustering of words and a S...
This paper applies Distributional Clustering (Pereira et al. 1993) to document classification. The ...
Automatic text categorization is the task of assigning natural language text documents to predefined...
In this paper, we study the effect of using n-grams (sequences of words of length n) for text catego...
Supervised text categorization is a machine learning task where a predefined category label is autom...
Within text categorization and other data mining tasks, the use of suitable methods for term weighti...
Automatic text categorization is the task of assigning natural language text documents to predefined...
This paper focuses on a comparative evaluation of a wide range of text categorization methods, inclu...