In the last decade, the size of deep neural architectures employed in Natural Language Processing (NLP) has grown exponentially, in some cases reaching hundreds of billions of parameters. However, training and deploying these huge architectures is an extremely resource-demanding process, and the costs are often unaffordable in real-world applications. For these reasons, considerable research and industrial effort is being invested in solutions that reduce the size of these models while maintaining high performance. This work studies and experiments with Knowledge Distillation techniques, with the goal of training smaller and cheaper models that closely approximate large pre-trained ones. Th...
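The Knowledge Distillation setup mentioned above can be illustrated with a minimal NumPy sketch of the classic distillation objective (soft teacher targets with a temperature, combined with the hard-label cross-entropy, in the style of Hinton et al., 2015). Function names and the `alpha`/`T` parameterization here are illustrative assumptions, not the specific setup used in this work.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T produces softer distributions,
    # exposing more of the teacher's "dark knowledge" about non-target classes.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) term and a hard
    (ground-truth) cross-entropy term. `alpha` balances the two."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL divergence between the temperature-softened distributions; the T**2
    # factor keeps gradient magnitudes comparable across temperatures.
    soft = (p_teacher * (np.log(p_teacher) - np.log(p_student))).sum(axis=-1).mean() * T**2
    # Standard cross-entropy of the student against the true labels (T=1).
    p_hard = softmax(student_logits)
    hard = -np.log(p_hard[np.arange(len(labels)), labels]).mean()
    return alpha * soft + (1 - alpha) * hard
```

When the student's logits exactly match the teacher's, the soft term vanishes and only the hard cross-entropy remains, which is a convenient sanity check when wiring this into a training loop.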
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
Natural language processing (NLP) techniques had significantly improved by introducing pre-trained l...
One of the main problems in the field of Artificial Intelligence is the efficiency of neural network...
Natural Language Processing (NLP) has seen tremendous improvements over the last few years. Transfor...
Deep and large pre-trained language models (e.g., BERT, GPT-3) are state-of-the-art for various natu...
This paper reports on the evaluation of Deep Learning (DL) transformer architecture models for Named...
Large pre-trained transformers are on top of contemporary semantic segmentation benchmarks, but come...
Given a large Transformer model, how can we obtain a small and computationally efficient model which...
The goal of my thesis is to investigate the most influential transformer architectures and to apply ...