We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithm works with any underlying model and (unknown) data-generating distribution and does not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previ...
Trustworthy machine learning is driving a large number of ML community works in order to improve ML ...
We study the performance -- and specifically the rate at which the error probability converges to ze...
Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn mod...
In safety-critical applications a probabilistic model is usually required to be calibrated, i.e., to...
Deep neural networks are powerful tools to detect hidden patterns in data and leverage them to make ...
In a well-calibrated risk prediction model, the average predicted probability is close to the true e...
When deployed in the real world, machine learning models inevitably encounter changes in the data di...
A much studied issue is the extent to which the confidence scores provided by machine learning algor...
Learning probabilistic classification and prediction models that generate accurate probabilities is ...
With model trustworthiness being crucial for sensitive real-world applications, practitioners are pu...
The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input,...
Deep neural networks (DNNs) have made great strides in pushing the state-of-the-art in several chall...
How and when can we depend on machine learning systems to make decisions for human-being? This is pr...
Fair calibration is a widely desirable fairness criteria in risk prediction contexts. One way to mea...
Wide-ranging digitalization has made it possible to capture increasingly larger amounts of data. In ...
Trustworthy machine learning is driving a large number of ML community works in order to improve ML ...
We study the performance -- and specifically the rate at which the error probability converges to ze...
Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn mod...
In safety-critical applications a probabilistic model is usually required to be calibrated, i.e., to...
Deep neural networks are powerful tools to detect hidden patterns in data and leverage them to make ...
In a well-calibrated risk prediction model, the average predicted probability is close to the true e...
When deployed in the real world, machine learning models inevitably encounter changes in the data di...
A much studied issue is the extent to which the confidence scores provided by machine learning algor...
Learning probabilistic classification and prediction models that generate accurate probabilities is ...
With model trustworthiness being crucial for sensitive real-world applications, practitioners are pu...
The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input,...
Deep neural networks (DNNs) have made great strides in pushing the state-of-the-art in several chall...
How and when can we depend on machine learning systems to make decisions for human-being? This is pr...
Fair calibration is a widely desirable fairness criteria in risk prediction contexts. One way to mea...
Wide-ranging digitalization has made it possible to capture increasingly larger amounts of data. In ...
Trustworthy machine learning is driving a large number of ML community works in order to improve ML ...
We study the performance -- and specifically the rate at which the error probability converges to ze...
Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn mod...