Datasets are rarely a realistic approximation of the target population: prevalence may be misrepresented, image quality may exceed clinical standards, and so on. This mismatch is known as sampling bias. Sampling biases are a major hindrance for machine learning models and cause significant gaps between model performance in the lab and in the real world. Our work addresses prevalence bias, the discrepancy between the prevalence of a pathology and its sampling rate in the training dataset, introduced when collecting data or when the practitioner rebalances the training batches. This paper lays out the theoretical and computational framework for training models, and for prediction, in the presence of prevalence bias. Concrete...
Krautenbacher N, Theis FJ, Fuchs C. Correcting Classifiers for Sample Selection Bias in Two-Phase Ca...
Biased labelers are a systemic problem in crowdsourcing, and a comprehensive toolbox for handling th...
With the deluge of digitized information in the Big Data era, massive datasets are becoming increasi...
We derive a family of loss functions to train models in the presence of sampling bias. Examples are ...
Selection bias is a massive problem in infectious disease epidemiology that can result in needless m...
When evaluating the performance of clinical machine learning models, one must consider the deploymen...
Recent research suggests that predictions made by machine-learning models can amplify biases present...
Machine Learning is a branch of artificial intelligence focused on building applications that learn ...
Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist in...
In this paper, we present a Bayesian approach to estimate the mean of a binary variable and changes ...
Machine learning and data-driven solutions open exciting opportunities in many disciplines including...
Despite the prominent use of complex survey data and the growing popularity of machine learning meth...
As machine learning (ML) models gain traction in clinical applications, understanding the impact of ...
Measurement error occurs frequently in observational studies investigating the relationship between...