Current language generation models suffer from issues such as repetition, incoherence, and hallucinations. An often-repeated hypothesis is that this brittleness of generation models is caused by the mismatch between the training and the generation procedures, also referred to as exposure bias. In this paper, we verify this hypothesis by analyzing exposure bias from an imitation learning perspective. We show that exposure bias leads to an accumulation of errors, analyze why perplexity fails to capture this accumulation, and empirically show that this accumulation results in poor generation quality. Source code to reproduce these experiments is available at https://github.com/kushalarora/quantifying_exposure_bias

Comment: Accepted in Findings of ACL 2022
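To make the train/generate mismatch concrete, here is a minimal simulation sketch, not the paper's experimental setup; EPS, HORIZON, and TRIALS are illustrative parameters, and the "errs with probability EPS given a correct prefix" assumption is a deliberate simplification. Under teacher forcing the model always conditions on the gold prefix, so its per-step error rate stays flat; in free-running generation it conditions on its own outputs, so one early mistake taints every later step.

```python
import random

# Toy illustration of exposure bias (hypothetical numbers, not from the paper):
# assume the generator makes an independent mistake with probability EPS at
# each step *given a correct prefix*, and that a prefix stays wrong once it
# has diverged from the gold sequence.

EPS = 0.05       # assumed per-step error rate given a correct prefix
HORIZON = 50     # sequence length
TRIALS = 10_000  # Monte Carlo trials

def step_is_wrong(prefix_correct: bool) -> bool:
    """One generation step: errs w.p. EPS on a correct prefix; stays off the
    gold sequence once the prefix has already diverged."""
    if not prefix_correct:
        return True
    return random.random() < EPS

teacher_forced_errors = 0
free_running_errors = 0
for _ in range(TRIALS):
    prefix_ok = True
    for t in range(HORIZON):
        # Teacher forcing: the prefix is reset to gold before every step,
        # so each step sees a correct context.
        teacher_forced_errors += step_is_wrong(True)
        # Free running: the model inherits its own (possibly wrong) prefix,
        # so errors compound over time.
        wrong = step_is_wrong(prefix_ok)
        free_running_errors += wrong
        prefix_ok = prefix_ok and not wrong

denom = TRIALS * HORIZON
print(f"teacher-forced per-step error rate: {teacher_forced_errors / denom:.3f}")
print(f"free-running per-step error rate:   {free_running_errors / denom:.3f}")
```

Under these assumptions the teacher-forced rate stays near EPS while the free-running rate at step t grows like 1 - (1 - EPS)^t (roughly 0.65 on average here), which is one way to see why a model can score well under teacher-forced perplexity yet still degrade during generation.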
Generating information from memory not only gives a read out of the contents of memory—it makes thos...
A classic debate in cognitive science revolves around understanding how children learn complex lingu...
Prediction error is known to enhance priming effects for familiar syntactic structures; it also stre...
The standard training algorithm in neural machine translation (NMT) suffers from exposure bias, and ...
Experiments in Artificial Language Learning have revealed much about the cognitive mechanisms un...
Recently, scores of high-performing code generation systems have surfaced. As has become a popular c...
Large language models generate complex, open-ended outputs: instead of outputting a class label they...
Successful language acquisition requires both generalization and lexically based learning. Previous ...
This paper asks whether a distinction between production-based and perception-based grammar inductio...
Language acquisition is a special kind of learning problem because the outcome of learning of one ge...
We present a computational model of language learning via a sequence of intera...
How do language learners avoid the production of verb argument structure overgeneralization errors (...
In order to gain insight into how people acquire certain reference biases in language and how those ...