Transfer learning from large language models (LLMs) has emerged as a powerful technique for knowledge-based fine-tuning on a range of tasks, as well as for adapting models to new domains and even new languages. However, it remains an open question if and when transfer learning will work, i.e., whether it leads to positive or negative transfer. In this paper, we analyze knowledge transfer across three natural language processing (NLP) tasks - text classification, sentiment analysis, and sentence similarity - using three LLMs - BERT, RoBERTa, and XLNet. We analyze their performance when fine-tuned on target datasets for domain and cross-lingual adaptation, with and without intermediate-task training on a larger dataset. Our experime...
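The comparison described above rests on a two-stage pipeline: optionally fine-tune the pretrained encoder on a larger intermediate task, then fine-tune it on the smaller target dataset. The sketch below illustrates that pipeline with the Hugging Face transformers and datasets libraries; the model choice, dataset names, hyperparameters, and output paths are illustrative assumptions, not the paper's actual experimental setup.

```python
# Minimal sketch of intermediate-task training followed by target fine-tuning.
# Assumes Hugging Face transformers/datasets; datasets and hyperparameters are placeholders.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "bert-base-uncased"  # could equally be "roberta-base" or "xlnet-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize(batch):
    # Pad to a fixed length so the default data collator can batch examples directly.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

def fine_tune(model, dataset, output_dir):
    """Fine-tune `model` on `dataset`, save it to `output_dir`, and return it."""
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=3,
                             per_device_train_batch_size=16)
    trainer = Trainer(model=model, args=args, train_dataset=dataset)
    trainer.train()
    trainer.save_model(output_dir)
    return model

# Stage 1 (optional): intermediate-task training on a larger labeled dataset.
intermediate = load_dataset("imdb", split="train").map(tokenize, batched=True)  # placeholder task
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model = fine_tune(model, intermediate, "out/intermediate")

# Stage 2: fine-tune on the target dataset, reloading the intermediate-trained encoder
# with a fresh classification head sized for the target label set.
target = load_dataset("tweet_eval", "sentiment", split="train").map(tokenize, batched=True)  # placeholder
model = AutoModelForSequenceClassification.from_pretrained(
    "out/intermediate", num_labels=3, ignore_mismatched_sizes=True)
model = fine_tune(model, target, "out/target")
```

Skipping Stage 1 and fine-tuning the pretrained checkpoint directly on the target dataset gives the baseline against which the intermediate-task variant is compared.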
The traditional machine learning paradigm of training a task-specific model on one single task has l...
Machine learning models cannot easily adapt to new domains and applications. This drawback becomes d...
Intermediate-task training—fine-tuning a pretrained model on an intermediate task before fine-tuning...
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on...
Pre-trained multilingual language models show significant performance gains for zero-shot cross-ling...
Transfer learning has led to large gains in performance for nearly all NLP tasks while making downst...
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven ...
The current generation of neural network-based natural language processing models excels at learning...
Substantial progress has been made in the field of natural language processing (NLP) due to the adve...
State-of-the-art NLP systems are generally based on the assumption that the underlying models are pr...
In cross-lingual language understanding, machine translation is often utilized to enhance the transf...
Pre-trained multilingual language models play an important role in cross-lingual natural language un...
The field of natural language processing (NLP) has recently seen a large change towards using pre-tr...
Recent work has shown that neural models can be successfully trained on multiple languages simultaneou...
Large pretrained multilingual models, trained on dozens of languages, have delivered promising resul...