Understanding and explaining the mistakes made by trained models is critical to many machine learning objectives, such as improving robustness, addressing concept drift, and mitigating biases. However, this is often an ad hoc process that involves manually looking at the model's mistakes on many test samples and guessing at the underlying reasons for those incorrect predictions. In this paper, we propose a systematic approach, conceptual counterfactual explanations (CCE), that explains why a classifier makes a mistake on a particular test sample (or set of samples) in terms of human-understandable concepts (e.g., this zebra is misclassified as a dog because of faint stripes). We base CCE on two prior ideas: counterfactual explanations and concept activation ...
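To make the idea concrete, here is a minimal, hedged sketch of a concept-space counterfactual in the spirit of the abstract above; it is not the authors' CCE implementation, and the names `embed`, `head`, and `concept_dirs` are illustrative assumptions. The classifier is split into a frozen feature extractor and head, and rows of `concept_dirs` are concept directions (e.g., a "stripes" direction) learned beforehand; we optimize sparse shifts along those directions until the prediction flips.

```python
import torch

def concept_counterfactual(embed, head, x, target_class, concept_dirs,
                           steps=200, lr=0.1, l1=0.01):
    """Find sparse concept-score shifts w that flip the prediction to target_class."""
    z = embed(x).detach()                       # fixed embedding of the test sample
    w = torch.zeros(concept_dirs.shape[0], requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        z_cf = z + w @ concept_dirs             # move along learned concept directions
        loss = (torch.nn.functional.cross_entropy(head(z_cf), target)
                + l1 * w.abs().sum())           # sparsity: few concepts should change
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()                           # large |w_k| suggests concept k explains the error
```

A large positive shift on, say, the "stripes" concept would then be read as "adding stripes would have fixed the prediction," which matches the zebra example in the abstract.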
Counterfactual examples for an input - perturbations that change specific features but not others - ...
Existing visual explanation generating agents learn to fluently justify a class prediction. Conseque...
Advanced AI models are powerful at making accurate predictions for complex problems. However, these ...
The same method that creates adversarial examples (AEs) to fool image-classifiers can be used to gen...
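The abstract above points out that the same optimization used for targeted adversarial examples can generate counterfactuals. Below is a minimal sketch of that shared machinery under stated assumptions (a differentiable `model`, an input image `x` in [0, 1]); the function name and parameters are hypothetical. Real counterfactual methods add further constraints (e.g., staying on the data manifold) so the edit is semantically meaningful rather than an imperceptible adversarial perturbation.

```python
import torch

def gradient_counterfactual(model, x, target_class, steps=100, lr=0.05, dist=0.1):
    """Targeted-attack-style search: push x toward target_class, staying close to x."""
    x_cf = x.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        loss = (torch.nn.functional.cross_entropy(model(x_cf), target)
                + dist * (x_cf - x).abs().sum())   # penalize distance from the original
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x_cf.clamp_(0, 1)                      # keep a valid image
    return x_cf.detach()
```

The distance weight `dist` is the knob that separates the two regimes: a tiny penalty yields adversarial-style imperceptible changes, while a stronger one forces larger, human-visible edits.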
This paper addresses the challenge of generating Counterfactual Explanations (CEs), involving the id...
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or...
Counterfactual explanations are a prominent example of post-hoc interpretability methods in the expl...
We propose a novel method for explaining the predictions of any classifier. In our approach, local e...
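The abstract above describes fitting a local, interpretable surrogate around a single prediction. The following is a hedged reconstruction of that general recipe, not the reference implementation: sample perturbations of the instance, query the black box, weight samples by proximity, and fit a weighted linear model whose coefficients serve as the explanation. The function and parameter names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_explanation(predict_proba, x, class_idx, n_samples=500, width=0.75):
    """Fit a proximity-weighted linear surrogate around instance x."""
    rng = np.random.default_rng(0)
    mask = rng.integers(0, 2, size=(n_samples, x.shape[0]))  # which features to keep
    samples = mask * x                                       # zero out dropped features
    labels = predict_proba(samples)[:, class_idx]            # black-box predictions
    dist = np.linalg.norm(mask - 1, axis=1) / np.sqrt(x.shape[0])
    weights = np.exp(-(dist ** 2) / width ** 2)              # nearby samples count more
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(mask, labels, sample_weight=weights)
    return surrogate.coef_                                   # per-feature local importance
```

A plain ridge regression is used here for simplicity; sparse variants (e.g., lasso) trade a little fidelity for explanations with fewer nonzero features.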
As deep learning models are increasingly used in safety-critical applications, explainability and tr...
While deep neural network models offer unmatched classification performance, they are prone to learn...
We investigate whether three types of post hoc model explanations--feature attribution, concept acti...
Deep learning models have achieved high performance across different domains, such as medical decisi...
Recent efforts have uncovered various methods for providing explanations that ...
Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algor...
Visual counterfactual explanations identify modifications to an image that would change the predicti...