Classifiers can provide counts of items per class, but systematic classification errors yield biases (e.g., if a class is often misclassified as another, its size may be under-estimated). To handle classification biases, the statistics and epidemiology domains devised methods for estimating unbiased class sizes (or class probabilities) without identifying which individual items are misclassified. These bias correction methods are applicable to machine learning classifiers, but in some cases yield high result variance and increased biases. We present the applicability and drawbacks of existing methods and extend them with three novel methods. Our Sample-to-Sample method provides accurate confidence intervals for the bias correction results. ...
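To make the idea of correcting class sizes without identifying individual errors concrete, here is a minimal sketch of one classical correction for the binary case, often called the adjusted count in the quantification literature and the Rogan-Gladen estimator in epidemiology. It is an illustration of the general family of methods the abstract refers to, not the paper's own methods. The function name `adjusted_count` and its parameters are hypothetical, and the true and false positive rates are assumed to have been estimated beforehand on a labeled test set.

```python
def adjusted_count(n_predicted_positive, n_total, tpr, fpr):
    """Estimate the true number of positives from a classifier's raw count.

    Assumption: tpr and fpr (true/false positive rates) were estimated on a
    labeled test set and still hold for the data being counted.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if tpr <= fpr:
        # The correction divides by (tpr - fpr); it is undefined or
        # meaningless when the classifier is no better than chance.
        raise ValueError("tpr must exceed fpr for the correction to apply")

    observed_rate = n_predicted_positive / n_total
    # observed_rate = tpr * p + fpr * (1 - p), solved for the true rate p.
    corrected_rate = (observed_rate - fpr) / (tpr - fpr)
    # Sampling noise can push the estimate outside [0, 1]; clip it back.
    corrected_rate = min(1.0, max(0.0, corrected_rate))
    return corrected_rate * n_total


# Hypothetical numbers: 170 of 1000 items predicted positive,
# with tpr = 0.8 and fpr = 0.1 measured on a test set.
estimate = adjusted_count(170, 1000, tpr=0.8, fpr=0.1)
print(round(estimate, 6))  # approximately 100.0, not the raw count of 170
```

The division by `tpr - fpr` also hints at the drawback noted above: when the two rates are close, small errors in their estimates are amplified, which is one way such corrections can yield high variance.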
Recent research suggests that predictions made by machine-learning models can amplify biases present...
Machine Learning is a branch of artificial intelligence focused on building applications that learn ...
Assigning class labels to instances is a key component of the machine learning technique known as cl...
When applying supervised machine learning algorithms to classification, the classical goal is to rec...
In discriminant analysis, class sizes are usually estimated by the proportion of a random sa...
supervised machine learning, estimation, mixture models, shifting class prior, nonstationary class d...
Imbalance of the classes, characterized by a disproportional ratio of observations in each class, is...
This paper promotes a new task for supervised machine learning research: quantification—the pursuit ...
Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artifi...
Several methods (independent subsamples, leave-one-out, cross-validation, and bootstrapping) have be...