Attention mechanisms have played a crucial role in the development of complex architectures such as Transformers in natural language processing. However, Transformers remain hard to interpret and are considered black boxes. In this paper we assess how attention coefficients from Transformers, when properly aggregated, can provide classifier interpretability. We propose a fast and easy-to-implement way of aggregating attention to build local feature importance. A human-grounded experiment is conducted to evaluate and compare this approach with other common interpretability methods. The experimental protocol relies on the capacity of an interpretability method to provide explanations in line with human reasoning. Exp...
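To make the idea of attention aggregation concrete, below is a minimal sketch of one plausible scheme: averaging the attention that the [CLS] token pays to each input token, across all heads and layers, to obtain a per-token importance score. The model name, the use of the [CLS] row, and the mean over heads and layers are illustrative assumptions for this sketch, not the paper's exact aggregation recipe.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: any encoder classifier that exposes attention weights works here.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, output_attentions=True
)
model.eval()

def attention_importance(text: str) -> list[tuple[str, float]]:
    """Return (token, importance) pairs from aggregated attention weights."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.attentions: tuple of (batch, heads, seq, seq), one tensor per layer
    attn = torch.stack(outputs.attentions)   # (layers, batch, heads, seq, seq)
    cls_attn = attn[:, 0, :, 0, :]           # attention from [CLS] to every token
    scores = cls_attn.mean(dim=(0, 1))       # average over layers and heads
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return list(zip(tokens, scores.tolist()))

for tok, score in attention_importance("The movie was surprisingly good."):
    print(f"{tok:>12s}  {score:.3f}")
```

This kind of aggregation is fast because it reuses attention weights already computed in a single forward pass, in contrast to perturbation-based methods such as LIME that require many model evaluations per explanation.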