Large-scale pretrained language models have led to significant improvements in Natural Language Processing. Unfortunately, they come at the cost of high computational and storage requirements that complicate their deployment on low-resource devices. This issue can be addressed by distilling knowledge from larger models to smaller ones through pseudo-labels on task-specific datasets. However, this can be difficult for tasks with very limited data. To overcome this challenge, we present a novel approach in which knowledge is distilled from a teacher model to a student model through the generation of synthetic data. To do this, we first fine-tune the teacher and student models, as well as a Natural Language Generation (NLG) model, on...
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
In Natural Language Processing (NLP), applications trained on downstream tasks for text classificati...
Although pre-trained language models such as BERT have achieved appealing performance in a wide range...
Deep and large pre-trained language models (e.g., BERT, GPT-3) are state-of-the-art for various natu...
In the natural language processing (NLP) literature, neural networks are becoming increasingly deepe...
To learn text understanding models with millions of parameters one needs massive amounts of data. In...
This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We fo...
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown e...
In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage re...