Most performance metrics for learning algorithms do not provide information about the misclassified instances. Knowing which instances are misclassified and understanding why they are misclassified could guide future algorithm develop-ment. In this paper, we analyze the classification of over 190,000 instances from 64 data sets and create heuristics to analyze and predict an instance’s expected dif-ficulty to classify correctly (instance hardness). We find that 5 % of the instances are misclassified by all 9 considered learning algorithms and that 17 % are mis-classified by at least half. The principal contributor to misclassification is class overlap. We demonstrate the utility of instance hardness by using it to filter hard instances from...
One of the significant problems in classification is class noise which has numerous potential conseq...
This paper presents a new approach for identifying and eliminating mislabeled instances in large or ...
Modern supervised learning algorithms can learn very accurate and complex discriminating functions. ...
Abstract Most data complexity studies have focused on characterizing the complexity of the entire da...
AbstractThis paper presents average-case analyses of instance-based learning algorithms. The algorit...
Despite the tremendous advances in machine learning (ML), training with imbalanced data still poses ...
Abstract. This paper presents an attempt to find a statistical model that predicts the hardness of t...
The empirical study of algorithms is a crucial topic in the design of new algorithms because the con...
Despite the tremendous advances in machine learning (ML), training with imbalanced data still poses ...
Storing and using specific instances improves the performance of several supervised learning algorit...
Abstract. Empirical hardness models predict a solver’s runtime for a given instance of an N P-hard p...
Given a noisy dataset, how to locate erroneous instances and attributes and rank suspicious instance...
Classes of real world datasets have various properties (such as imbalance, size, complexity, and cla...
We discuss basic sample complexity theory and it's impact on classification success evaluation,...
Class-wise characteristics of training examples affect the performance of deep classifiers. A well-s...
One of the significant problems in classification is class noise which has numerous potential conseq...
This paper presents a new approach for identifying and eliminating mislabeled instances in large or ...
Modern supervised learning algorithms can learn very accurate and complex discriminating functions. ...
Abstract Most data complexity studies have focused on characterizing the complexity of the entire da...
AbstractThis paper presents average-case analyses of instance-based learning algorithms. The algorit...
Despite the tremendous advances in machine learning (ML), training with imbalanced data still poses ...
Abstract. This paper presents an attempt to find a statistical model that predicts the hardness of t...
The empirical study of algorithms is a crucial topic in the design of new algorithms because the con...
Despite the tremendous advances in machine learning (ML), training with imbalanced data still poses ...
Storing and using specific instances improves the performance of several supervised learning algorit...
Abstract. Empirical hardness models predict a solver’s runtime for a given instance of an N P-hard p...
Given a noisy dataset, how to locate erroneous instances and attributes and rank suspicious instance...
Classes of real world datasets have various properties (such as imbalance, size, complexity, and cla...
We discuss basic sample complexity theory and it's impact on classification success evaluation,...
Class-wise characteristics of training examples affect the performance of deep classifiers. A well-s...
One of the significant problems in classification is class noise which has numerous potential conseq...
This paper presents a new approach for identifying and eliminating mislabeled instances in large or ...
Modern supervised learning algorithms can learn very accurate and complex discriminating functions. ...