Classes in real-world datasets have various properties (such as imbalance, size, complexity, and class distribution) that make the classification task more difficult. We investigate the robustness of six classification techniques on data exhibiting various combinations of these properties. One artificial domain and six real-world datasets are used in the experiments. The results of our analysis point to the frequency-based classifiers (such as the fuzzy and the Bayes classifiers) as being more robust across varying imbalance, size, complexity, and training distribution. © 2011 Inderscience Enterprises Ltd
The field of machine learning has made a lot of progress in recent years. As it is used more fre...
The following thesis explores the impact of the dataset distributional properties on classificatio...
Most performance metrics for learning algorithms do not provide information about the misclassified ...
Many real-world datasets exhibit skewed class distributions in which almost all instances ...
In this article we analyze the effect of class distribution on classifier learning. We begin by des...
When choosing a classification rule, it is important to take into account the amount of sample data ...
This thesis studied the methodologies to improve the quality of training data in order to enhance cl...
In this paper, we test some of the most commonly used classifiers to identify which ones are the mos...
Practitioners of data mining and machine learning have long observed that the imbalance of classes i...
During the process of knowledge discovery in data, imbalanced learning data often emerges and presen...
Many of today's large data sets must be reduced in size before invoking inductive algorithms, due to...
In this contribution, the question of reporting performance of binary classifiers is opened in cont...
We evaluated the robustness of our classification algorithms by testing with different sizes for ...
A common assumption made in the field of Pattern Recognition is that the priors inherent ...
In the field of machine learning, classification is one of the most common types to be deployed in so...