Understanding and explaining the mistakes made by trained models is critical to many machine learning objectives, such as improving robustness, addressing concept drift, and mitigating biases. However, this is often an ad hoc process that involves manually looking at the model's mistakes on many test samples and guessing at the underlying reasons for those incorrect predictions. In this paper, we propose a systematic approach, conceptual counterfactual explanations (CCE), that explains why a classifier makes a mistake on a particular test sample (or set of samples) in terms of human-understandable concepts (e.g., this zebra is misclassified as a dog because of faint stripes). We base CCE on two prior ideas: counterfactual explanations and concept activation ...
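To make the idea concrete, here is a minimal, hedged sketch of a concept-space counterfactual in the spirit of the abstract above; it is not the authors' CCE implementation, and the names `embed`, `head`, and `concept_dirs` are illustrative assumptions. The classifier is split into a frozen feature extractor and head, and rows of `concept_dirs` are concept directions (e.g., a "stripes" direction) learned beforehand; we optimize sparse shifts along those directions until the prediction flips.

```python
import torch

def concept_counterfactual(embed, head, x, target_class, concept_dirs,
                           steps=200, lr=0.1, l1=0.01):
    """Find sparse concept-score shifts w that flip the prediction to target_class."""
    z = embed(x).detach()                       # fixed embedding of the test sample
    w = torch.zeros(concept_dirs.shape[0], requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        z_cf = z + w @ concept_dirs             # move along learned concept directions
        loss = (torch.nn.functional.cross_entropy(head(z_cf), target)
                + l1 * w.abs().sum())           # sparsity: few concepts should change
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()                           # large |w_k| suggests concept k explains the error
```

A large positive shift on, say, the "stripes" concept would then be read as "adding stripes would have fixed the prediction," which matches the zebra example in the abstract.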
Counterfactual examples for an input - perturbations that change specific features but not others - ...
Existing visual explanation generating agents learn to fluently justify a class prediction. Conseque...
Advanced AI models are powerful at making accurate predictions for complex problems. However, these ...
The same method that creates adversarial examples (AEs) to fool image-classifiers can be used to gen...
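The abstract above points out that the same optimization used for targeted adversarial examples can generate counterfactuals. Below is a minimal sketch of that shared machinery under stated assumptions (a differentiable `model`, an input image `x` in [0, 1]); the function name and parameters are hypothetical. Real counterfactual methods add further constraints (e.g., staying on the data manifold) so the edit is semantically meaningful rather than an imperceptible adversarial perturbation.

```python
import torch

def gradient_counterfactual(model, x, target_class, steps=100, lr=0.05, dist=0.1):
    """Targeted-attack-style search: push x toward target_class, staying close to x."""
    x_cf = x.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        loss = (torch.nn.functional.cross_entropy(model(x_cf), target)
                + dist * (x_cf - x).abs().sum())   # penalize distance from the original
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x_cf.clamp_(0, 1)                      # keep a valid image
    return x_cf.detach()
```

The distance weight `dist` is the knob that separates the two regimes: a tiny penalty yields adversarial-style imperceptible changes, while a stronger one forces larger, human-visible edits.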
This paper addresses the challenge of generating Counterfactual Explanations (CEs), involving the id...
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or...
Counterfactual explanations are a prominent example of post-hoc interpretability methods in the expl...
We propose a novel method for explaining the predictions of any classifier. In our approach, local e...
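The abstract above describes fitting a local, interpretable surrogate around a single prediction. The following is a hedged reconstruction of that general recipe, not the reference implementation: sample perturbations of the instance, query the black box, weight samples by proximity, and fit a weighted linear model whose coefficients serve as the explanation. The function and parameter names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_explanation(predict_proba, x, class_idx, n_samples=500, width=0.75):
    """Fit a proximity-weighted linear surrogate around instance x."""
    rng = np.random.default_rng(0)
    mask = rng.integers(0, 2, size=(n_samples, x.shape[0]))  # which features to keep
    samples = mask * x                                       # zero out dropped features
    labels = predict_proba(samples)[:, class_idx]            # black-box predictions
    dist = np.linalg.norm(mask - 1, axis=1) / np.sqrt(x.shape[0])
    weights = np.exp(-(dist ** 2) / width ** 2)              # nearby samples count more
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(mask, labels, sample_weight=weights)
    return surrogate.coef_                                   # per-feature local importance
```

A plain ridge regression is used here for simplicity; sparse variants (e.g., lasso) trade a little fidelity for explanations with fewer nonzero features.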
As deep learning models are increasingly used in safety-critical applications, explainability and tr...
While deep neural network models offer unmatched classification performance, they are prone to learn...
We investigate whether three types of post hoc model explanations--feature attribution, concept acti...
Deep learning models have achieved high performance across different domains, such as medical decisi...
Recent efforts have uncovered various methods for providing explanations that ...
Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algor...
Visual counterfactual explanations identify modifications to an image that would change the predicti...