Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ``Colossal C...
Neural machine translation (NMT), where neural networks are used to generate translations, has revol...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
In recent years, many applications are using various forms of deep learning models. Such methods are...
The current generation of neural network-based natural language processing models excels at learning...
Substantial progress has been made in the field of natural language processing (NLP) due to the adve...
Transfer learning from large language models (LLMs) has emerged as a powerful technique to enable kn...
Transfer learning is to apply knowledge or patterns learned in a particular field or task to differe...
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven ...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...
Machine learning models cannot easily adapt to new domains and applications. This drawback becomes d...
Transfer learning improves quality for low-resource machine translation, but it is unclear what exac...
Vocabulary transfer is a transfer learning subtask in which language models fine-tune with the corpu...
In transfer learning, two major activities, i.e., pretraining and fine-tuning, are carried out to pe...
Title: Exploring Benefits of Transfer Learning in Neural Machine Translation Author: Tom Kocmi Depar...
Neural machine translation (NMT), where neural networks are used to generate translations, has revol...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
In recent years, many applications are using various forms of deep learning models. Such methods are...
The current generation of neural network-based natural language processing models excels at learning...
Substantial progress has been made in the field of natural language processing (NLP) due to the adve...
Transfer learning from large language models (LLMs) has emerged as a powerful technique to enable kn...
Transfer learning is to apply knowledge or patterns learned in a particular field or task to differe...
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven ...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...
Machine learning models cannot easily adapt to new domains and applications. This drawback becomes d...
Transfer learning improves quality for low-resource machine translation, but it is unclear what exac...
Vocabulary transfer is a transfer learning subtask in which language models fine-tune with the corpu...
In transfer learning, two major activities, i.e., pretraining and fine-tuning, are carried out to pe...
Title: Exploring Benefits of Transfer Learning in Neural Machine Translation Author: Tom Kocmi Depar...
Neural machine translation (NMT), where neural networks are used to generate translations, has revol...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
In recent years, many applications are using various forms of deep learning models. Such methods are...