Automated feature selection is important for text categorization because it reduces the feature size and speeds up the learning process of classifiers. In this paper, we present a novel and efficient feature selection framework based on information theory, which aims to rank features by their discriminative capacity for classification. We first revisit two information measures, Kullback-Leibler divergence and Jeffreys divergence, for binary hypothesis testing, and analyze their asymptotic properties in relation to the type I and type II errors of a Bayesian classifier. We then introduce a new divergence measure, called Jeffreys-Multi-Hypothesis (JMH) divergence, to measure multi-distribution divergence for multi-class classification. Based on the JMH-di...
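As a rough illustration of the two binary measures named in the abstract above, the following sketch computes the Kullback-Leibler divergence and the Jeffreys divergence (taken here in its standard form as the sum of the two directed KL divergences) for a pair of discrete distributions. The function names and the example distributions `p` and `q` are assumptions made for illustration and do not come from the paper; the JMH divergence is not shown because its definition is not given in the text available here.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for discrete distributions on the same support.

    Terms with p_i == 0 contribute nothing; if q_i == 0 while p_i > 0
    the divergence is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def jeffreys_divergence(p, q):
    """Jeffreys divergence: the symmetrised sum KL(P || Q) + KL(Q || P)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical class-conditional term distributions for two classes.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

print("KL(p || q)    =", kl_divergence(p, q))
print("KL(q || p)    =", kl_divergence(q, p))
print("Jeffreys(p,q) =", jeffreys_divergence(p, q))
```

The KL divergence depends on the order of its arguments, whereas the Jeffreys divergence does not, which is one reason a symmetric measure can be convenient when neither class plays the role of a reference hypothesis.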
The problem of feature selection is critical in several areas of machine learning and data analysis ...
The underlying assumption in traditional machine learning algorithms is that instances are Independe...
Naïve Bayes classifiers which are widely used for text classification in machine learning ar...
In this paper, we present a new wrapper feature selection approach based on Jensen-Shannon (JS) dive...
A major characteristic of the text document classification problem is extremely high dimension...
Application of a feature selection algorithm to a textual data set can improve the performance of so...
The automated classification of texts into predefined categories has witnessed a booming interest, d...
In this paper, we present a Bayesian classification approach for automatic text categorization using...
Text categorization is the task of discovering the category or class that text documents belong to, or i...
There are numerous text documents available in electronic form. More and more are becoming available...
Feature selection has been extensively applied in statistical pattern recognition as a mechanism for...
This paper focuses on enhancing feature selection (FS) performance on a classification data...
Many feature selection methods have been proposed for text categorization. However, their performanc...
The maximum entropy approach to classification is very well studied in applied statistics and m...