Understanding model performance on unlabeled data is a fundamental challenge of developing, deploying, and maintaining AI systems. Model performance is typically evaluated using test sets or periodic manual quality assessments, both of which require laborious manual data labeling. Automated performance prediction techniques aim to mitigate this burden, but potential inaccuracy and a lack of trust in their predictions have prevented their widespread adoption. We address this core problem of performance prediction uncertainty with a method to compute prediction intervals for model performance. Our methodology uses transfer learning to train an uncertainty model to estimate the uncertainty of model performance predictions. We evaluate our appro...
When the test distribution differs from the training distribution, machine learning models can perfo...
Data drift refers to the variation in the production data compared to the training data and sometimes...
In high-dimensional prediction settings, it remains challenging to reliably estimate the test perfor...
With the rapid adoption of deep learning in critical applications, the question of when and how much to ...
A rich literature discussing techniques for adopting neural networks for metamodelling of ...
In this paper, we review several methods from Statistical Learning Theory (SLT) for the performance a...
Monitoring machine learning models once they are deployed is challenging. It is even more challengin...
Configurable software systems allow stakeholders to derive program variants by selecting fe...
This paper considers the quantification of the prediction performance in Gaussian process regression...
Machine learning models are typically deployed in a test setting that differs from the training sett...
Learning methods for predictive models have traditionally focused on prediction quality and model bu...
The capability of effectively quantifying the uncertainty associated with a given prediction is an imp...
Accurate trajectory prediction is vital for various applications, including autonomous vehicles. How...
Cloud service management for telecommunication operators is crucial and challenging, especially in a c...