Target-task-matched parallel corpora are required for statistical translation model training. However, training corpora sometimes include both sentences that match the target task and sentences that do not. In such cases, training set selection can reduce the size of the translation model. In this paper, we propose a training set selection method for translation model training that uses linear translation model interpolation and a language model technique. According to the experimental results, the proposed method reduces the translation model size by 50% and improves the BLEU score by 1.76% compared with using the baseline training corpus.
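The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common way to combine the two ingredients it names. It assumes that candidate sentence pairs are ranked by the perplexity of their source side under a language model trained on target-task text, and that two translation model probabilities are then mixed linearly. The add-one-smoothed unigram model, the function names, `keep_ratio`, and `weight` are all illustrative assumptions, not the authors' settings.

```python
import math
from collections import Counter


def train_unigram_lm(sentences):
    """Return an add-one-smoothed unigram probability function (assumed LM)."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(token):
        return (counts[token] + 1) / (total + vocab)

    return prob


def perplexity(prob, sentence):
    """Per-word perplexity of a whitespace-tokenized sentence."""
    tokens = sentence.split()
    if not tokens:
        return float("inf")
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def select_training_set(in_domain, candidates, keep_ratio=0.5):
    """Keep the fraction of (source, target) pairs whose source side has
    the lowest perplexity under the target-task language model."""
    prob = train_unigram_lm(in_domain)
    ranked = sorted(candidates, key=lambda pair: perplexity(prob, pair[0]))
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]


def interpolate(p_selected, p_general, weight=0.5):
    """Linear interpolation of two translation model probabilities."""
    return weight * p_selected + (1.0 - weight) * p_general


if __name__ == "__main__":
    task_text = ["book a hotel room", "where is the station"]
    pairs = [("book a room", "x"), ("stock market falls sharply", "y")]
    print(select_training_set(task_text, pairs))
    print(interpolate(0.8, 0.2))
```

In this sketch a lower perplexity is taken to mean a closer match to the target task. A real system would use a higher-order language model and tune both the kept fraction and the interpolation weight on held-out data.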
Statistical machine translation (SMT) systems use statistical learning methods to learn how to trans...
Differently from the traditional statistical MT that decomposes the translation task into distinct s...
The competitive performance of neural machine translation (NMT) critically relies on large amounts o...
Parallel corpus is an indispensable resource for translation model training in statistical machine t...
Statistical machine translation relies heavily on available parallel corpora, but SMT may not have t...
The training data size is of utmost importance for statistical machine translation (SMT), since it a...
We report on findings of exploiting large data sets for translation modeling, language modeling and...
Data selection techniques applied to neural machine translation (NMT) aim to increase the performanc...
We propose a novel statistical translation model to improve translation selection of collocation. In...
Machine translation is the application of machines to translate text or speech from one natural lang...
Monolingual data have been demonstrated to be helpful in improving translation quality of both stati...
In this paper we present experiments concerning translation model adaptation for statistic...
Recent work on training of log-linear interpolation models for statistical machine translation repo...
In this work, we make a study on the effect of training set on statistical language modeling (SLM). ...
Machine Translation models are trained to translate a variety of documents from one language into an...