Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train. We propose a simple and efficient learning framework, TLM, that does not rely on large-scale pretraining. Given some labeled task data and a large general corpus, TLM uses task data as queries to retrieve a tiny subset of the general corpus and jointly optimizes the task objective and the language modeling objective from scratch. On eight classification datasets in four domains, TLM achieves results better than or similar to pretrained language models (e.g., RoBERTa-Large) while reducing the training FLOPs by two orders of magnitude. With high accuracy and efficiency, we hope TLM will contribute to...
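The retrieval step described above can be illustrated with a minimal sketch. This is an assumption-laden stand-in, not the authors' pipeline: TLM uses BM25-style retrieval, while here a simple token-overlap score selects the top-k general-corpus documents for each labeled task example; all document and query strings below are invented for illustration.

```python
# Sketch of TLM-style data selection: labeled task examples act as
# queries, and a small subset of a general corpus is retrieved for
# joint training. Token-overlap scoring stands in for BM25 here.

def overlap_score(query_tokens, doc_tokens):
    """Count distinct query tokens that also appear in the document."""
    return len(set(query_tokens) & set(doc_tokens))

def retrieve_subset(task_texts, corpus, k=2):
    """Return the union of top-k corpus docs over all task queries."""
    selected = set()
    for query in task_texts:
        q = query.lower().split()
        # Rank corpus documents by overlap with this query (stable sort).
        ranked = sorted(
            range(len(corpus)),
            key=lambda i: overlap_score(q, corpus[i].lower().split()),
            reverse=True,
        )
        selected.update(ranked[:k])
    # Preserve corpus order in the returned subset.
    return [corpus[i] for i in sorted(selected)]

corpus = [
    "stock markets fell sharply on inflation fears",
    "the striker scored twice in the final match",
    "new chip design doubles transistor density",
    "central bank raises interest rates again",
]
task = ["bank raises rates amid inflation"]
print(retrieve_subset(task, corpus, k=2))
```

The retrieved subset (here the two finance documents) would then be used alongside the task data to optimize the task and language-modeling objectives jointly from scratch, rather than pretraining on the full corpus.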
Even though many efficient transformers have been proposed, only a few such models are available for s...
Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-...
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural langua...
Existing pre-trained models are generally geared towards a particular class of problems. To date, th...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...
Pretrained language models (PTLMs) are typically learned over a large, static corpus and further fin...
Deploying large language models (LLMs) is challenging because they are memory inefficient and comput...
Scaling language models with more data, compute and parameters has driven significant progress in na...
Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). Thes...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Large Language Models (LLMs) have significantly advanced the field of Natural Language Processing (N...
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on...
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown e...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
Pretrained language models have shown success in various areas of natural language processing, inclu...