GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various natural language processing tasks. However, LM fine-tuning often suffers from catastrophic forgetting when applied to resource-rich tasks. In this work, we introduce a concerted training framework (CTnmt) that is the key to integrate the pre-trained LMs to neural machine translation (NMT). Our proposed CTnmt} consists of three techniques: a) asymptotic distillation to ensure that the NMT model can retain the previous pre-trained knowledge; b) a dynamic switching gate to avoid catastrophic forgetting of pre-trained knowledge; and c) a strategy to adjust the learning paces according to a scheduled policy. Our experiments in machine translation s...
Neural Machine Translation (NMT) is a new approach to machine translation that has made great progre...
Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation b...
Pre-trained language models received extensive attention in recent years. However, it is still chall...
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various n...
Pre-training and fine-tuning have achieved great success in natural language process field. The stan...
Can we utilize extremely large monolingual text to improve neural machine translation without the ex...
Pre-training and fine-tuning have become the de facto paradigm in many natural language processing (...
Differently from the traditional statistical MT that decomposes the translation task into distinct s...
Improving machine translation (MT) systems with translation memories (TMs) is of great interest to p...
Humans benefit from communication but suffer from language barriers. Machine translation (MT) aims t...
Transfer learning is to apply knowledge or patterns learned in a particular field or task to differe...
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
Scarcity of parallel sentence-pairs poses a significant hurdle for training high-quality Neural Mach...
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
Neural machine translation (NMT) is often described as ‘data hungry’ as it typically requires large ...
Neural Machine Translation (NMT) is a new approach to machine translation that has made great progre...
Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation b...
Pre-trained language models received extensive attention in recent years. However, it is still chall...
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various n...
Pre-training and fine-tuning have achieved great success in natural language process field. The stan...
Can we utilize extremely large monolingual text to improve neural machine translation without the ex...
Pre-training and fine-tuning have become the de facto paradigm in many natural language processing (...
Differently from the traditional statistical MT that decomposes the translation task into distinct s...
Improving machine translation (MT) systems with translation memories (TMs) is of great interest to p...
Humans benefit from communication but suffer from language barriers. Machine translation (MT) aims t...
Transfer learning is to apply knowledge or patterns learned in a particular field or task to differe...
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
Scarcity of parallel sentence-pairs poses a significant hurdle for training high-quality Neural Mach...
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
Neural machine translation (NMT) is often described as ‘data hungry’ as it typically requires large ...
Neural Machine Translation (NMT) is a new approach to machine translation that has made great progre...
Multilingual Neural Machine Translation (MNMT) trains a single NMT model that supports translation b...
Pre-trained language models received extensive attention in recent years. However, it is still chall...