Transformers are responsible for the vast majority of recent advances in natural language processing. Most practical applications of these models are enabled through transfer learning. This paper studies whether corpus-specific tokenization used for fine-tuning improves the performance of the resulting model. Through a series of experiments, we demonstrate that such tokenization, combined with an initialization and fine-tuning strategy for the vocabulary tokens, speeds up the transfer and boosts the performance of the fine-tuned model. We call this aspect of transfer facilitation vocabulary transfer.
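To make the idea concrete, the sketch below shows one common way to combine a corpus-specific tokenizer with embedding initialization before fine-tuning. It assumes Hugging Face `transformers`, a `bert-base-uncased` checkpoint, and a placeholder `target_corpus`; none of these, nor the mean-of-subword initialization rule, are prescribed by the abstract above.

```python
# A minimal sketch of corpus-specific vocabulary transfer.
# Assumptions (not from the paper): bert-base-uncased, WordPiece "##" markers,
# and `target_corpus` as a stand-in for the downstream-domain texts.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"
old_tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

target_corpus = ["..."]  # placeholder iterator of downstream-domain texts

# 1. Train a corpus-specific tokenizer with the same algorithm as the old one.
new_tok = old_tok.train_new_from_iterator(target_corpus, vocab_size=30_000)

# 2. Initialize the new embedding matrix: split each new token with the old
#    tokenizer and average the matched old embeddings.
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.empty(len(new_tok), old_emb.size(1))
for token, new_id in new_tok.get_vocab().items():
    pieces = old_tok.tokenize(token.replace("##", ""))  # "##" is WordPiece-specific
    old_ids = old_tok.convert_tokens_to_ids(pieces or [old_tok.unk_token])
    new_emb[new_id] = old_emb[old_ids].mean(dim=0)

# 3. Swap in the new vocabulary, then fine-tune as usual.
model.resize_token_embeddings(len(new_tok))
model.get_input_embeddings().weight.data.copy_(new_emb)
```

Averaging old sub-token embeddings is only one plausible initialization; the abstract refers to an initialization and fine-tuning strategy without pinning it to this particular rule.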
This chapter presents an overview of the state of the art in natural language processing, exploring ...
Transformer-based pre-trained models with millions of parameters require large storage. Recent appro...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
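As a rough illustration of the bias-only idea described above, the following sketch freezes every parameter except bias vectors (plus a freshly initialized classification head); the checkpoint and task are assumptions, not the paper's exact setup.

```python
# A hedged sketch of BitFit-style sparse fine-tuning: only bias terms
# (and the placeholder classifier head) remain trainable.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.4%}")
```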
Vocabulary transfer is a transfer learning subtask in which language models fine-tune with the corpu...
The current generation of neural network-based natural language processing models excels at learning...
Transfer learning improves quality for low-resource machine translation, but it is unclear what exac...
Parameter-efficient fine-tuning methods (PEFTs) offer the promise of adapting large pre-trained mode...
Natural Language Processing (NLP) has seen tremendous improvements over the last few years. Transfor...
Data augmentation can improve a model's final accuracy by introducing new data samples to the dataset....
Some natural languages belong to the same family or share similar syntactic and/or semantic regulari...
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on...
Some natural languages belong to the same family or share similar syntactic and/or semantic regulari...
Natural language processing (NLP) techniques have significantly improved with the introduction of pre-trained l...
The field of natural language processing (NLP) has recently undergone a paradigm shift. Since the in...
Pretrained Transformers achieve state-of-the-art performance in various code-processing tasks but ma...