Theoretical thesis. Bibliography: pages 49-57.
Contents: 1 Introduction -- 2 Background and literature review -- 3 Approach -- 4 Fine-tuning DistilBERT -- 5 Gradual unfreezing experiments -- 6 Conclusion and future work.
Pretrained transformer-based language models have achieved state-of-the-art results on various Natural Language Processing (NLP) tasks. These models can be fine-tuned on a range of downstream tasks with minimal modifications. However, fine-tuning a language model may result in catastrophic forgetting, and the model tends to overfit on smaller training datasets. Gradually unfreezing the pretrained weights is one approach to avoiding catastrophic forgetting of the knowledge learnt from the source task. Multi-task fine-t...
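As a rough illustration of the gradual-unfreezing idea named in this abstract, the sketch below freezes DistilBERT's pretrained encoder and then unfreezes its transformer blocks top-down, one block per epoch. This is a minimal sketch assuming the Hugging Face transformers library and PyTorch; the model checkpoint, the one-block-per-epoch schedule, and the placeholder training loop are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal sketch of gradual unfreezing for DistilBERT fine-tuning.
# Assumes the Hugging Face `transformers` library; layer attribute names
# follow DistilBERT's published architecture.
from transformers import DistilBertForSequenceClassification

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Start with the whole pretrained encoder frozen; only the task head trains.
for param in model.distilbert.parameters():
    param.requires_grad = False

def unfreeze_top_layers(model, n):
    """Unfreeze the top n transformer blocks (DistilBERT has 6)."""
    for layer in model.distilbert.transformer.layer[-n:]:
        for param in layer.parameters():
            param.requires_grad = True

# Illustrative schedule: unfreeze one more block per epoch, top-down,
# so the layers closest to the task head adapt first.
for epoch in range(6):
    unfreeze_top_layers(model, n=epoch + 1)
    # ... run one epoch of training on the downstream task here ...
```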
Knowledge probing is crucial for understanding the knowledge transfer mechanism behind the pre-train...
We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural ...
The recent development of massive multilingual transformer networks has resulted in drastic improvem...
Transformer models have achieved promising results on natural language processing (NLP) tasks includ...
There has been an increase in the number of large and high-performing models made available for vari...
The overarching problem that Natural Language Processing (NLP) research tries to solve is linguistic...
The objective of this work is to propose new fine-tuning techniques to improve the models of the ...
Large Language Models (LLMs) are nowadays used to solve a growing range of tasks, focusing on knowledge-intensive ...
The BioASQ question answering (QA) benchmark dataset contains questions in English, along with golde...
When pre-trained on large unsupervised textual corpora, language models are able to store and retri...
Large Language Models have become the core architecture upon which most modern natural language proc...
Natural Language Processing (NLP) has seen tremendous improvements over the last few years. Transfor...
Recent works have shown promising results of prompt tuning in stimulating pre-trained language model...
We developed a method for producing statistical language models for speech-driven question answering...
FrBMedQA is the first French biomedical question answering dataset, containing 41k+ passage-question...