While recent language models have the ability to take long contexts as input, relatively little is known about how well the language models use longer context. We analyze language model performance on two tasks that require identifying relevant information within their input contexts: multi-document question answering and key-value retrieval. We find that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts. Furthermore, performance substantially decreases as the input context grows longer, even for explicitly long-context models. Our analysis provides a better understanding of how lan...
Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, ex...
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multip...
Models based around the transformer architecture have become one of the most prominent for solving a...
While recent language models have the ability to take long contexts as input, relatively little is k...
While large language models (LLMs) are equipped with longer text input capabilities than before, the...
The increasingly widespread adoption of large language models has highlighted the need for improving...
Knowledge-intensive tasks, such as open-domain question answering (QA), require access to a large am...
Language models (LMs) have been used in cognitive modeling as well as engineering studies -- they co...
Large Language Models (LLMs) have ushered in a transformative era in the field of natural language p...
We present a series of long-context LLMs that support effective context windows of up to 32,768 toke...
Thesis (Ph.D.)--University of Washington, 2023Language models (LMs) are at the core of almost all st...
When pre-trained on large unsupervised textual corpora, language models are able to store and retri...
As the performance of large language models rapidly improves, benchmarks are getting larger and more...
While Transformer language models (LMs) are state-of-the-art for information extraction, long text i...
Large language models have an exceptional capability to incorporate new information in a contextual ...
Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, ex...
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multip...
Models based around the transformer architecture have become one of the most prominent for solving a...
While recent language models have the ability to take long contexts as input, relatively little is k...
While large language models (LLMs) are equipped with longer text input capabilities than before, the...
The increasingly widespread adoption of large language models has highlighted the need for improving...
Knowledge-intensive tasks, such as open-domain question answering (QA), require access to a large am...
Language models (LMs) have been used in cognitive modeling as well as engineering studies -- they co...
Large Language Models (LLMs) have ushered in a transformative era in the field of natural language p...
We present a series of long-context LLMs that support effective context windows of up to 32,768 toke...
Thesis (Ph.D.)--University of Washington, 2023Language models (LMs) are at the core of almost all st...
When pre-trained on large unsupervised textual corpora, language models are able to store and retri...
As the performance of large language models rapidly improves, benchmarks are getting larger and more...
While Transformer language models (LMs) are state-of-the-art for information extraction, long text i...
Large language models have an exceptional capability to incorporate new information in a contextual ...
Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, ex...
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multip...
Models based around the transformer architecture have become one of the most prominent for solving a...