Can we exploit extremely large monolingual corpora to improve neural machine translation without expensive back-translation? Neural machine translation models are trained on parallel bilingual corpora, and even the large ones contain only 20 to 40 million parallel sentence pairs. Meanwhile, pre-trained language models such as BERT and GPT are typically trained on billions of monolingual sentences. Directly using BERT to initialize the Transformer encoder yields no benefit, because BERT's knowledge is catastrophically forgotten during further training on MT data. This example shows how to run the CTNMT (Yang et al., 2020) training method, which integrates BERT into a Transformer MT model and was the first successful method to d...
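One of CTNMT's core ideas is a dynamic switch: an element-wise sigmoid gate that decides, per dimension, how much of the frozen BERT representation flows into the NMT encoder state, instead of overwriting it. The sketch below illustrates that gating computation in NumPy under stated assumptions; the parameter names `W`, `U`, and `b` and the toy dimensions are illustrative, not the actual API of any CTNMT implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_switch(h_bert, h_nmt, W, U, b):
    """CTNMT-style dynamic switch (Yang et al., 2020), sketched:
        g = sigmoid(W @ h_bert + U @ h_nmt + b)
        h = g * h_bert + (1 - g) * h_nmt
    Each dimension of the fused state h is a convex combination of the
    BERT representation and the NMT encoder representation.
    """
    g = sigmoid(W @ h_bert + U @ h_nmt + b)
    return g * h_bert + (1.0 - g) * h_nmt

# Toy dimensionality for illustration; real models use the shared
# hidden size of BERT and the Transformer encoder (e.g. 768).
rng = np.random.default_rng(0)
d = 4
h_bert = rng.standard_normal(d)   # stands in for a frozen BERT output vector
h_nmt = rng.standard_normal(d)    # stands in for an NMT encoder embedding
W = rng.standard_normal((d, d))
U = rng.standard_normal((d, d))
b = np.zeros(d)

h = dynamic_switch(h_bert, h_nmt, W, U, b)
```

Because the gate is element-wise and bounded in (0, 1), every dimension of the fused vector lies between the corresponding BERT and NMT values, so a degenerate gate can fall back entirely on either representation.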
This article describes our experiments in neural machine translation using the recent Tensor2Tensor ...
With the advent of deep neural networks in recent years, Neural Machine Translation (NMT) systems ha...
Title: Exploring Benefits of Transfer Learning in Neural Machine Translation Author: Tom Kocmi Depar...
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various n...
Neural machine translation (NMT) is often described as ‘data hungry’ as it typically requires large ...
Humans benefit from communication but suffer from language barriers. Machine translation (MT) aims t...
Pre-training and fine-tuning have achieved great success in the natural language processing field. The stan...
Pre-trained transformer is a class of neural networks behind many recent natural language processing...
A prerequisite for training corpus-based machine translation (MT) systems – either Statistical MT (S...
Pre-trained language models received extensive attention in recent years. However, it is still chall...
Neural machine translation (NMT), where neural networks are used to generate translations, has revol...
Monolingual data have been demonstrated to be helpful in improving translation quality of both stati...
Neural machine translation (NMT) has been a mainstream method for the machine translation (MT) task....
Pre-training and fine-tuning have become the de facto paradigm in many natural language processing (...