A prerequisite for training corpus-based machine translation (MT) systems – either Statistical MT (SMT) or Neural MT (NMT) – is the availability of high-quality parallel data. This is arguably more important today than ever before, as NMT has been shown in many studies to outperform SMT, but mostly when large parallel corpora are available; in cases where data is limited, SMT can still outperform NMT. Recently, researchers have shown that back-translating monolingual data can create synthetic parallel corpora, which in turn can be combined with authentic parallel data to train a high-quality NMT system. Given that large collections of new parallel text become available only rarely, back-translation has become the ...
Telugu is the fifteenth most commonly spoken language in the world with an estimated reach of 75 mil...
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NM...
This paper discusses the role played by parallel corpora in the design and implementation of fully a...
Neural Machine Translation (NMT) models achieve their best performance when large sets of parallel d...
Machine translation (MT) has benefited from using synthetic training data originating from translati...
We consider a low-resource translation task from Finnish into Northern Sámi. Collecting all availabl...
Neural Machine Translation has achieved state-of-the-art performance for several language pairs usin...
Neural machine translation (NMT) has been a mainstream method for the machine translation (MT) task....
Neural machine translation (NMT) is often described as ‘data hungry’ as it typically requires large ...
Back translation is one of the most widely used methods for improving the performance of neural mach...
Data selection has proven its merit for improving Neural Machine Translation (NMT), when applied to ...
Monolingual data have been demonstrated to be helpful in improving translation quality of both stati...
Most Indian languages lack sufficient parallel data for Machine Translation (MT) training. In this s...
With the advent of deep neural networks in recent years, Neural Machine Translation (NMT) systems ha...