In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chinese Pre-trained Unbalanced Transformer (CPT). Different from previous Chinese PTMs, CPT is designed to utilize the shared knowledge between natural language understanding (NLU) and natural language generation (NLG) to boost performance. CPT consists of three parts: a shared encoder, an understanding decoder, and a generation decoder. Two specific decoders with a shared encoder are pre-trained with masked language modeling (MLM) and denoising auto-encoding (DAE) tasks, respectively. With the partially shared architecture and multi-task pre-training, CPT can (1) learn specific knowledge of both NLU and NLG tasks with two decoders and (2) be f...
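To make the partially shared layout concrete, here is a minimal, hypothetical sketch of the shared-encoder / two-decoder structure the abstract describes; it is not the authors' released implementation. The class name CPTSketch, the layer counts, dimensions, and the BERT-style vocabulary size of 21128 are all assumptions chosen for illustration; only the split into a shared encoder, a bidirectional understanding decoder with an MLM head, and an autoregressive generation decoder trained with DAE follows the text.

```python
# Hypothetical sketch of the CPT layout (shared encoder, two decoders).
# All sizes and names are illustrative assumptions, not the released model.
import torch
import torch.nn as nn

class CPTSketch(nn.Module):
    def __init__(self, vocab_size=21128, d_model=768, nhead=12,
                 enc_layers=10, dec_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Shared encoder used by both the NLU and NLG paths.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers=enc_layers)
        # Understanding decoder: extra bidirectional layers plus an MLM head.
        self.u_decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers=dec_layers)
        self.mlm_head = nn.Linear(d_model, vocab_size)
        # Generation decoder: causal layers with cross-attention, trained
        # with denoising auto-encoding (DAE).
        self.g_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
            num_layers=dec_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids=None):
        memory = self.encoder(self.embed(src_ids))
        # NLU path: predict masked tokens from the understanding decoder.
        mlm_logits = self.mlm_head(self.u_decoder(memory))
        gen_logits = None
        if tgt_ids is not None:
            # NLG path: autoregressive reconstruction of the clean text.
            tgt = self.embed(tgt_ids)
            seq_len = tgt.size(1)
            causal = torch.triu(
                torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
            gen_logits = self.lm_head(
                self.g_decoder(tgt, memory, tgt_mask=causal))
        return mlm_logits, gen_logits

# Toy usage: one batch through both pre-training paths.
model = CPTSketch()
src = torch.randint(0, 21128, (2, 16))
tgt = torch.randint(0, 21128, (2, 16))
mlm_logits, gen_logits = model(src, tgt)
```

In this layout, a downstream NLU task would fine-tune only the encoder and understanding decoder, while an NLG task would fine-tune the encoder and generation decoder, which is the flexibility the multi-task pre-training is meant to provide.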
Some Transformer-based models can perform cross-lingual transfer learning: those models can be train...
The Transformer model is a recent, fast, and powerful advance in neural machine translation. W...
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NM...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Transformer-based neural models are used in many AI applications. Training these models is expensive...
Neural machine translation (NMT) is a data-driven machine translation approach that has proven its s...
Natural language processing (NLP) techniques have been significantly improved by the introduction of pre-trained l...
Structure prediction (SP) tasks are important in natural language understanding in the sense that th...
Machine translation has received significant attention in the field of natural language processing n...
Pre-training and fine-tuning have become the de facto paradigm in many natural language processing (...
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the W...
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various n...
Transformer is a neural machine translation model that revolutionized machine translation. Compared...
Pre-trained sequence-to-sequence models have significantly improved Neural Machine Translation (NMT)...
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on...