The current modus operandi in NLP involves downloading and fine-tuning pre-trained models consisting of hundreds of millions, or even billions, of parameters. Storing and sharing such large trained models is expensive, slow, and time-consuming, which impedes progress towards more general and versatile NLP methods that learn from and for many tasks. Adapters, small learnt bottleneck layers inserted within each layer of a pre-trained model, ameliorate this issue by avoiding full fine-tuning of the entire model. However, sharing and integrating adapter layers is not straightforward. We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages. The framework, built on top of the popular HuggingFace Transformers library, ...
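As a rough illustration of the "stitching-in" workflow described above, here is a minimal sketch that loads a pre-trained model and attaches a task adapter downloaded from the Hub. It assumes the adapter-transformers Python package released with AdapterHub; the class name AutoModelWithHeads, the adapter identifier "sst-2", and the "pfeiffer" configuration are plausible examples from that library and may differ across versions.

```python
# Minimal sketch, assuming the adapter-transformers package (the HuggingFace
# Transformers fork released with AdapterHub). Identifiers such as
# "bert-base-uncased", "sst-2", and "pfeiffer" are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelWithHeads

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelWithHeads.from_pretrained("bert-base-uncased")

# Download a pre-trained task adapter (small bottleneck layers plus a
# prediction head) and stitch it into the frozen base model.
model.load_adapter("sst-2", config="pfeiffer")
model.set_active_adapters("sst-2")

# Inference runs through the shared pre-trained weights plus the adapter.
inputs = tokenizer("AdapterHub makes sharing adapters easy.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs)
```

Because only the small adapter weights are task-specific, sharing a trained model amounts to sharing a few megabytes of adapter parameters rather than the full checkpoint.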
Pre-trained language models have received extensive attention in recent years. However, it is still chall...
Natural language processing (NLP) techniques have improved significantly with the introduction of pre-trained l...
The goal of my thesis is to investigate the most influential transformer architectures and to apply ...
Transformer-based pre-trained models with millions of parameters require large storage. Recent appro...
State-of-the-art pretrained NLP models contain between a hundred million and a trillion parameters. Adapters pr...
Massively pre-trained transformer models such as BERT have achieved great success in many downstream N...
Combining structured information with language models is a standing problem in NLP. Building on prev...
The main goal behind state-of-the-art pretrained multilingual models such as multilingual BERT and X...
New class Pipeline (beta): easily run and use models on down-stream NLP tasks. We have added a new cl...
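For context on the class this release note introduces: a pipeline bundles tokenization, the model forward pass, and post-processing behind a single call. A minimal usage sketch with the HuggingFace Transformers library follows; the task string and input text are illustrative, and the library downloads a default pre-trained model for the task.

```python
from transformers import pipeline

# Create a ready-to-use pipeline for a downstream task; a default
# pre-trained model and tokenizer are downloaded automatically.
classifier = pipeline("sentiment-analysis")

# The pipeline tokenizes the raw text, runs the model, and converts
# logits into labelled scores, e.g. [{'label': 'POSITIVE', 'score': ...}].
print(classifier("Adapters make sharing fine-tuned models much cheaper."))
```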
Adapter modules were recently introduced as an efficient alternative to fine-t...
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
In 2017, Vaswani et al. proposed a new neural network architecture named Transformer. That modern ar...
Progress in natural language processing research is catalyzed by the possibilities given by the wide...