State-of-the-art NLP systems generally assume that the underlying models can be trained on vast datasets. In multilingual contexts, however, such datasets are often scarce, and this setting therefore deserves further research. This thesis investigates the benefits of introducing an additional training step, named Intermediate Training, when fine-tuning NLP models; this step can be exploited to augment the data used during the training phase. Intermediate Training consists of training models on NLP tasks that are not strictly related to the target task, with the aim of verifying whether the models can leverage the knowledge learned from those tasks. Furthermore, in order to better analyz...
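The two-stage procedure described above can be sketched in a few lines of PyTorch. This is a minimal, hedged illustration, not the thesis's actual setup: the tiny encoder, the task heads, and the random data stand in for a pretrained multilingual model and real intermediate/target datasets.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a pretrained encoder shared across both training stages."""
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.ff = nn.Linear(dim, dim)

    def forward(self, x):
        # Mean-pool token representations into one sentence vector.
        return self.ff(self.emb(x)).mean(dim=1)

def fine_tune(encoder, head, batches, steps=20):
    """Generic fine-tuning loop reused for the intermediate and target tasks."""
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        for x, y in batches:
            opt.zero_grad()
            loss_fn(head(encoder(x)), y).backward()
            opt.step()

torch.manual_seed(0)
encoder = TinyEncoder()

# Stage 1 (Intermediate Training): a task not strictly related to the target,
# e.g. a 3-way classification such as NLI. Data here is random and illustrative.
inter_head = nn.Linear(16, 3)
inter_data = [(torch.randint(0, 100, (8, 5)), torch.randint(0, 3, (8,)))]
fine_tune(encoder, inter_head, inter_data)

# Stage 2 (target task): a fresh head, but the encoder carries over the
# knowledge acquired during the intermediate stage.
target_head = nn.Linear(16, 2)
target_data = [(torch.randint(0, 100, (8, 5)), torch.randint(0, 2, (8,)))]
fine_tune(encoder, target_head, target_data)

logits = target_head(encoder(target_data[0][0]))
print(logits.shape)  # torch.Size([8, 2])
```

The key design choice is that only the head is replaced between stages: the encoder's parameters are trained continuously, which is what allows the target task to benefit from the intermediate one.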
Recent work has shown that neural models can be successfully trained on multiple languages simultaneou...
After becoming familiar with preparing text data in different formats and training different algorit...
Pretraining deep neural networks to perform language modeling - that is, to reconstruct missing word...
Intermediate-task training—fine-tuning a pretrained model on an intermediate task before fine-tuning...
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
The Transformer model is a very recent, fast and powerful discovery in neural machine translation. W...
In real scenarios, a multilingual model trained to solve NLP tasks on a set of languages can be requ...
Transfer learning from large language models (LLMs) has emerged as a powerful technique to enable kn...
Abstract: The subject area of multilingual natural language processing (NLP) is concerned with the p...
Representational spaces learned via language modeling are fundamental to Natural Language Processing...
Recently there has been a lot of interest in neural network based language models. These models typi...
Recent research has shown promise in multilingual modeling, demonstrating how a single model is capa...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
Unlike traditional statistical MT, which decomposes the translation task into distinct s...
Large pre-trained masked language models have become state-of-the-art solutions for many NLP problem...