Pretrained Language Models (LMs) have demonstrated the ability to perform numerical reasoning by extrapolating from a few examples in few-shot settings. However, the extent to which this extrapolation relies on robust reasoning is unclear. In this paper, we investigate how well these models reason with terms that are less frequent in the pretraining data. In particular, we examine the correlations between the model performance on test instances and the frequency of terms from those instances in the pretraining data. We measure the strength of this correlation for a number of GPT-based language models (pretrained on the Pile dataset) on various numerical deduction tasks (e.g., arithmetic and unit conversion). Our results consistently demonstrate...
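The analysis described above can be pictured with a minimal sketch: given per-instance correctness for a model and precomputed occurrence counts of each instance's terms in the pretraining corpus, compare accuracy on the most- versus least-frequent terms. Everything in the snippet below, including the `Instance` container, the `pretraining_counts` mapping, and the `frequency_vs_accuracy` helper, is an illustrative assumption rather than the paper's actual pipeline or metric.

```python
# Hedged sketch: relate per-instance accuracy to pretraining term frequency.
from dataclasses import dataclass
from statistics import mean


@dataclass
class Instance:
    term: str      # e.g. an operand appearing in the arithmetic prompt
    correct: bool  # whether the LM answered this instance correctly


def frequency_vs_accuracy(instances, pretraining_counts, quantile=0.1):
    """Compare accuracy on the most- vs. least-frequent terms.

    `pretraining_counts` maps a term to its (assumed precomputed)
    occurrence count in the pretraining corpus, e.g. counted over the Pile.
    """
    # Rank instances by how often their term occurs in pretraining data.
    ranked = sorted(instances, key=lambda x: pretraining_counts.get(x.term, 0))
    k = max(1, int(len(ranked) * quantile))
    low, high = ranked[:k], ranked[-k:]
    acc = lambda group: mean(1.0 if x.correct else 0.0 for x in group)
    return {
        "acc_low_freq": acc(low),
        "acc_high_freq": acc(high),
        "gap": acc(high) - acc(low),
    }


# Toy usage with made-up counts and outcomes:
counts = {"24": 1_200_000, "17": 900_000, "913": 40_000, "7781": 2_500}
data = [Instance("24", True), Instance("17", True),
        Instance("913", False), Instance("7781", False)]
print(frequency_vs_accuracy(data, counts, quantile=0.5))
```

A positive `gap` under this kind of comparison would indicate that the model is more accurate on instances whose terms appear more often in pretraining, which is the correlation the paper sets out to measure.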