Natural Language Inference is a challenging task that has received substantial attention, and state-of-the-art models now achieve impressive test set performance in the form of accuracy scores. Here, we go beyond this single evaluation metric to examine robustness to semantically-valid alterations to the input data. We identify three factors - insensitivity, polarity and unseen pairs - and compare their impact on three SNLI models under a variety of conditions. Our results demonstrate a number of strengths and weaknesses in the models’ ability to generalise to new in-domain instances. In particular, while strong performance is possible on unseen hypernyms, unseen antonyms are more challenging for all the models. More generally, the models s...
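The first abstract above centres on probing NLI models with controlled, semantically-valid lexical substitutions (for example, swapping a hypothesis word for a hypernym or an antonym). As a rough illustrative sketch only (not the authors' code), the snippet below uses NLTK's WordNet interface to generate such substitution probes from an SNLI-style premise and hypothesis; the function names, the naive string replacement, and the expected-label heuristics (hypernyms preserve entailment, antonyms tend towards contradiction) are assumptions made purely for illustration.

```python
# Illustrative sketch, not the paper's implementation: generate lexical-substitution
# probes for an NLI pair using WordNet hypernyms and antonyms.
# Assumes nltk is installed and the WordNet corpus is available
# (run nltk.download("wordnet") once beforehand).
from nltk.corpus import wordnet as wn


def hypernym_probes(premise, hypothesis, target_word):
    """Replace target_word in the hypothesis with WordNet hypernyms.

    Heuristic expectation (an assumption): if the original pair is entailed,
    replacing a hypothesis noun with a hypernym should preserve entailment.
    """
    probes = []
    for synset in wn.synsets(target_word, pos=wn.NOUN):
        for hyper in synset.hypernyms():
            for lemma_name in hyper.lemma_names():
                substitute = lemma_name.replace("_", " ")
                # Naive string replacement: does not adjust articles or inflection.
                probes.append((premise, hypothesis.replace(target_word, substitute), "entailment"))
    return probes


def antonym_probes(premise, hypothesis, target_word):
    """Replace target_word in the hypothesis with WordNet antonyms.

    Heuristic expectation (an assumption): substituting an antonym typically
    flips an entailed pair towards contradiction.
    """
    probes = []
    for synset in wn.synsets(target_word):
        for lemma in synset.lemmas():
            for antonym in lemma.antonyms():
                substitute = antonym.name().replace("_", " ")
                probes.append((premise, hypothesis.replace(target_word, substitute), "contradiction"))
    return probes


if __name__ == "__main__":
    # Toy SNLI-style example (hypothetical, for illustration only).
    print(hypernym_probes("A dog is running in the park.",
                          "An animal is running in the park.", "animal")[:3])
    print(antonym_probes("The door is open.", "The door is open.", "open")[:3])
```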
We present a human-in-the-loop dashboard tailored to diagnosing potential spurious features that NLI...
Modern language models based on deep artificial neural networks have achieved impressive progress in...
The recent success of deep learning neural language models such as Bidirectional Encoder Representat...
Success in natural language inference (NLI) should require a model to understand both lexical and co...
Natural Language Inference (NLI) models are known to learn from biases and artefacts within their tr...
Natural language inference (NLI) is a predictive task of determining the inference relationship ...
Natural Language Inference (NLI) plays an important role in many natural language processing tasks s...
Natural language inference (NLI) models are susceptible to learning shortcuts, i.e. decision rules t...
The outstanding performance recently achieved by Neural Language Models (NLMs) across many Natural La...
Large language models like T5 achieve excellent performance on various NLI benchmarks. However, it has been sh...
Do state-of-the-art models for language understanding already have, or can they easily learn, abilit...
With the recent success of deep learning methods, neural-based models have achieved superior perform...