An empirical study of instance hardness

Michael R. Smith
Tony Martinez
Christophe Giraud-Carrier

Publication date

October 2015

Abstract

Most performance metrics for learning algorithms do not provide information about the misclassified instances. Knowing which instances are misclassified and understanding why they are misclassified could guide future algorithm development. In this paper, we analyze the classification of over 190,000 instances from 64 data sets and create heuristics to analyze and predict an instance’s expected difficulty to classify correctly (instance hardness). We find that 5% of the instances are misclassified by all 9 considered learning algorithms and that 17% are misclassified by at least half. The principal contributor to misclassification is class overlap. We demonstrate the utility of instance hardness by using it to filter hard instances from...
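
The counts quoted in the abstract (instances misclassified by all algorithms, or by at least half) can be illustrated with a minimal sketch. This is a simplified stand-in, not the paper's own definition or code: it assumes instance hardness can be approximated by the fraction of algorithms that misclassify an instance. The function names and the toy error table are hypothetical.

```python
from typing import List


def misclassification_rate(errors: List[List[bool]]) -> List[float]:
    """Per-instance fraction of algorithms that got the instance wrong.

    errors[i][j] is True when (hypothetical) algorithm j misclassified
    instance i. This is an assumed proxy for instance hardness.
    """
    return [sum(row) / len(row) for row in errors]


def share_at_least(rates: List[float], threshold: float) -> float:
    """Fraction of instances whose misclassification rate is >= threshold."""
    return sum(r >= threshold for r in rates) / len(rates)


# Made-up toy data: 4 instances (rows) evaluated by 3 algorithms (columns).
errors = [
    [False, False, False],  # classified correctly by every algorithm
    [True, False, False],   # missed by one algorithm
    [True, True, False],    # missed by two algorithms
    [True, True, True],     # missed by every algorithm
]

rates = misclassification_rate(errors)
all_wrong = share_at_least(rates, 1.0)   # misclassified by all algorithms
half_wrong = share_at_least(rates, 0.5)  # misclassified by at least half

print(rates, all_wrong, half_wrong)
```

With real data, each row would hold one instance's held-out predictions from each learning algorithm compared against the true label.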
