A central question in natural language understanding (NLU) research is whether high performance demonstrates the models' strong reasoning capabilities. We present an extensive series of controlled experiments in which pre-trained language models are exposed to data that have undergone specific corruption transformations. These involve removing instances of specific word classes and often lead to nonsensical sentences. Our results show that performance remains high on most GLUE tasks when the models are fine-tuned or tested on corrupted data, suggesting that they leverage other cues for prediction even in nonsensical contexts. Our proposed data transformations can be used to assess the extent to which a specific dataset constitutes a proper t...
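The corruption transformation described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the authors' implementation: it uses a toy hand-built POS dictionary in place of a real tagger (in practice one would use a POS tagger such as spaCy's), and the `corrupt` function and `TOY_POS` table are assumptions introduced here for illustration only.

```python
# Toy POS lookup standing in for a real tagger (assumption for this sketch).
TOY_POS = {
    "cat": "NOUN", "dog": "NOUN", "mat": "NOUN",
    "sat": "VERB", "chased": "VERB",
    "the": "DET", "a": "DET", "on": "ADP",
}

def corrupt(sentence: str, drop_class: str) -> str:
    """Remove every token belonging to `drop_class`, yielding the kind of
    nonsensical sentence the corruption transformations produce."""
    kept = [tok for tok in sentence.split()
            if TOY_POS.get(tok.lower()) != drop_class]
    return " ".join(kept)

print(corrupt("The cat sat on the mat", "VERB"))  # -> "The cat on the mat"
print(corrupt("The dog chased a cat", "NOUN"))    # -> "The chased a"
```

Dropping verbs or nouns in this way destroys the sentence's compositional meaning while leaving lexical cues (content-word overlap, sentence length) largely intact, which is exactly what lets a model keep scoring well if it relies on such cues.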
Do state-of-the-art models for language understanding already have, or can they easily learn, abilit...
Spurious correlations are a threat to the trustworthiness of natural language processing systems, mo...
Modern neural language models that are widely used in various NLP tasks risk memorizing sensitive in...
Many believe human-level natural language inference (NLI) has already been achieved. In reality, mod...
Language models, given their black-box nature, often exhibit sensitivity to input perturbations, lea...
The outstanding performance recently reached by Neural Language Models (NLMs) across many Natural La...
Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bia...
Natural Language Inference (NLI) models are known to learn from biases and artefacts within their tr...
In NLP, models are usually evaluated by reporting single-number performance scores on a number of re...
In recent years, language models (LMs) have made remarkable progress in advancing the field of natu...
Natural Language Understanding (NLU) is a branch of Natural Language Processing (NLP) that uses inte...
Success in natural language inference (NLI) should require a model to understand both lexical and co...
Current natural language processing (NLP) models such as BERT and RoBERTa have achieved high overall...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...