On guaranteed optimal robust explanations for NLP models

La Malfa, E
Michelmore, R
Zbrzezny, AM
Paoletti, N
Kwiatkowska, M

Publication date

January 2021

Publisher

International Joint Conferences on Artificial Intelligence

Abstract

We build on abduction-based explanations for machine learning and develop a method for computing local explanations for neural network models in natural language processing (NLP). Our explanations comprise a subset of the words of the input text that satisfies two key features: optimality w.r.t. a user-defined cost function, such as the length of explanation, and robustness, in that they ensure prediction invariance for any bounded perturbation in the embedding space of the left-out words. We present two solution algorithms, respectively based on implicit hitting sets and maximum universal subsets, introducing a number of algorithmic improvements to speed up convergence of hard instances. We show how our method can be configured with differ...
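The two properties named in the abstract, optimality with respect to a cost function and robustness to bounded perturbations of the left-out words, can be illustrated with a minimal sketch. This is not the paper's implicit hitting set or maximum universal subset algorithm: it is a brute-force enumeration over word subsets, applied to a hypothetical linear scorer with one-dimensional embeddings so that the robustness check can be done exactly in closed form. The names `POSITION_WEIGHTS`, `EMBEDDINGS`, `is_robust` and `minimal_robust_explanation`, and all numeric values, are assumptions made for illustration only. The cost function is taken to be the explanation length.

```python
from itertools import combinations

# Hypothetical toy model (an assumption, not the networks from the paper):
# score = sum_i POSITION_WEIGHTS[i] * embedding_i, predicted class = sign(score).
POSITION_WEIGHTS = [3.0, 0.2, 1.0, 0.2]
EMBEDDINGS = {"not": -1.0, "a": 0.0, "great": 2.0, "movie": 0.1}


def score(words):
    """Linear score of the toy model for a fixed-length input."""
    return sum(a * EMBEDDINGS[w] for a, w in zip(POSITION_WEIGHTS, words))


def is_robust(words, kept, eps):
    """Exact robustness check for the linear toy model.

    Words at the indices in `kept` are fixed; every left-out word may move
    by at most `eps` in embedding space. The prediction is invariant iff
    the worst-case shift of the score cannot reach zero.
    """
    free = [i for i in range(len(words)) if i not in kept]
    worst_shift = eps * sum(abs(POSITION_WEIGHTS[i]) for i in free)
    return abs(score(words)) > worst_shift


def minimal_robust_explanation(words, eps):
    """Enumerate subsets by increasing size, so the first robust subset
    found has minimum length (the cost function assumed here)."""
    for k in range(len(words) + 1):
        for kept in combinations(range(len(words)), k):
            if is_robust(words, kept, eps):
                return [words[i] for i in kept]
    return None


sentence = ["not", "a", "great", "movie"]
print(minimal_robust_explanation(sentence, eps=0.5))  # ['not']
```

With `eps=0.5`, fixing only the first word is enough to keep the toy prediction negative under any admissible perturbation of the other three words, so the explanation has length one. Exhaustive enumeration is exponential in the number of words, and for a real neural network the closed-form check would have to be replaced by a verification procedure.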
