This paper investigates active learning to improve statistical machine translation (SMT) for low-resource language pairs, i.e., when there is very little pre-existing parallel text. Since generating additional parallel text to train SMT may be costly, active sampling selects the sentences from a monolingual corpus that, if translated, would have the greatest positive impact on training SMT models. We investigate different strategies, such as density and diversity preferences, as well as multi-strategy methods, such as a modified version of DUAL and our new ensemble approach, GraDUAL. These result in significant BLEU-score improvements over strong baselines when parallel training data is scarce.
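To make the idea of combining density and diversity preferences concrete, here is a minimal sketch of greedy sentence selection from a monolingual pool. It is not the paper's DUAL or GraDUAL method: the scoring rule, the function names (`ngrams`, `score`, `select_batch`), and the whitespace tokenization are all assumptions chosen for illustration. Density is approximated by how often a sentence's n-grams occur in the pool, and diversity by counting only n-grams not yet covered by the labeled data or by earlier selections.

```python
from collections import Counter


def ngrams(tokens, n_max=2):
    """Yield every n-gram of length 1..n_max as a tuple of tokens."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def score(sentence, pool_counts, covered, n_max=2):
    """Hypothetical score: pool frequency of the sentence's uncovered
    n-grams (density), normalized by its number of distinct n-grams."""
    grams = set(ngrams(sentence.split(), n_max))
    if not grams:
        return 0.0
    # Diversity: n-grams that are already covered contribute nothing.
    return sum(pool_counts[g] for g in grams if g not in covered) / len(grams)


def select_batch(pool, labeled, batch_size, n_max=2):
    """Greedily pick up to batch_size sentences to send for translation."""
    pool_counts = Counter(g for s in pool for g in ngrams(s.split(), n_max))
    covered = {g for s in labeled for g in ngrams(s.split(), n_max)}
    remaining = list(pool)
    chosen = []
    while remaining and len(chosen) < batch_size:
        best = max(remaining, key=lambda s: score(s, pool_counts, covered, n_max))
        chosen.append(best)
        remaining.remove(best)
        # Mark the chosen sentence's n-grams as covered so that later
        # picks are pushed toward material not yet selected.
        covered.update(ngrams(best.split(), n_max))
    return chosen


pool = ["the cat sat", "the cat sat", "a dog barked", "the cat"]
labeled = ["the cat"]
batch = select_batch(pool, labeled, batch_size=2)
```

In this toy pool, the sentence whose n-grams are all unseen is picked first, and the sentence that only partly overlaps the labeled data comes second. A real system would re-train or re-score after each batch and would likely use a proper tokenizer.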
This book provides a unified view on a new methodology for Machine Translation (MT). This methodolog...
The amount of training data in statistical machine translation is critical for translation quality. ...
© Cambridge University Press, 2015. Statistical machine translation (SMT) is gaining interest given t...
Statistical machine translation (SMT) models need large bilingual corpora for training, which are ...
Corpus based approaches to automatic translation such as Example Based and Statistical Machine Trans...
Statistical Machine Translation (SMT) models learn how to translate by examining a bilingual paralle...
In data-driven Machine Translation approaches, like Example-Based Machine Translation (EBMT) (Brown...
Parallel corpus is an indispensable resource for translation model training in statistical machine t...
In recent years, corpus based approaches to machine translation have become predominant, with Statis...
Traditional active learning (AL) methods for machine translation (MT) rely on heuristics. However, t...
Statistical machine translation relies heavily on available parallel corpora, but SMT may not have t...
In this article we address the issue of generating diversified translation systems from a single Sta...
Interactive-predictive translation is a collaborative iterative process, where human translators pro...
Sentence-aligned bilingual texts are a crucial resource to build statistical machine translation (SM...
Statistical machine translation (SMT) systems use statistical learning methods to learn how to trans...