As the demand for sophisticated Natural Language Processing (NLP) models continues to grow, so does the need for efficient pre-training techniques, since current NLP models undergo resource-intensive pre-training. In response, we introduce $FastDoc$ (Fast Pre-training Technique using Document-Level Metadata and Taxonomy), a novel approach designed to significantly reduce computational demands. $FastDoc$ leverages document metadata and domain-specific taxonomy as supervision signals: it continually pre-trains an open-domain transformer encoder using sentence-level embeddings, and then fine-tunes it using token-level embeddings. We evaluate $FastDoc$ on six tasks across nine datasets spanning three distinct domains. Remarkably, $FastDo...
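To make the pre-training recipe above concrete, the sketch below shows one plausible way to supervise an encoder at the sentence/document level using metadata labels. This is a minimal illustration, not the authors' exact method: the triplet objective, mean pooling, the `bert-base-uncased` checkpoint, and the toy `triplets` data are all assumptions made for the example; downstream fine-tuning would then reuse the same encoder with token-level representations.

```python
# Hedged sketch: metadata-supervised continual pre-training at the
# sentence/document level. Documents sharing a metadata/taxonomy label
# act as positives; documents from a different category act as negatives.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").to(device)
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def embed(texts):
    """Sentence-level embeddings via mean pooling over token states."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt").to(device)
    hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)           # (B, H)

# Toy triplets (anchor, positive, negative); purely illustrative data.
triplets = [
    ("Install the compressor before sealing the unit.",   # anchor
     "Mount the compressor onto the base plate.",         # same taxonomy label
     "Quarterly revenue grew by twelve percent."),        # different label
]

encoder.train()
for anchor, positive, negative in triplets:
    a, p, n = embed([anchor]), embed([positive]), embed([negative])
    loss = F.triplet_margin_loss(a, p, n, margin=1.0)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"triplet loss: {loss.item():.4f}")
```

Because the supervision comes from document-level metadata rather than token-level masking, each training step processes whole-document embeddings, which is one way such an approach can cut pre-training compute relative to standard masked-language-model objectives.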
Recent work has demonstrated that pre-training in-domain language models can boost performance when ...
Despite achieving state-of-the-art performance on many NLP tasks, the high energy cost and long infe...
The remarkable success of large language models has been driven by dense models trained on massive u...
Pre-training on larger datasets with ever-increasing model size is now a proven recipe for increased...
Language models (LMs) such as BERT and GPT have revolutionized natural language processing (NLP). Ho...
Pretrained language models have shown success in various areas of natural language processing, inclu...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Large language models (LLMs) have demonstrated remarkable open-domain capabilities. Traditionally, L...
Recent deep learning models for tabular data now compete with the traditional ML models based ...
Pretrained language models have become the standard approach for many NLP tasks due to strong perfor...
In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence language model pretrained wi...
With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have ...
Thesis (Ph.D.), University of Washington, 2022. A robust language processing machine should be able to...
Current pre-trained language models (PLMs) are typically trained with static data, ignoring that in r...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...