In this paper, we study the effect of different word-level preprocessing decisions for Arabic on SMT quality. Our results show that, given large amounts of training data, splitting off only proclitics performs best. For small amounts of training data, however, it is best to apply English-like tokenization using part-of-speech tags together with sophisticated morphological analysis and disambiguation. Moreover, choosing the appropriate preprocessing produces a significant increase in BLEU score when the genre changes between training and test data.
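To make "splitting off only proclitics" concrete, here is a minimal sketch. It is not the paper's method, which relies on morphological analysis and disambiguation. It assumes Buckwalter-style transliterated input, and the clitic inventories, the `MIN_STEM_LEN` guard, and the function names are all hypothetical choices made for illustration.

```python
# Deliberately naive proclitic splitter over Buckwalter-style transliterated
# tokens. Every constant below is an illustrative assumption, not taken from
# the paper; a real system would let a morphological analyzer decide.

CONJUNCTIONS = ("w", "f")    # assumed conjunction proclitics
PARTICLES = ("b", "l", "k")  # assumed particle proclitics
MIN_STEM_LEN = 3             # assumed guard against over-splitting


def split_proclitics(token: str) -> list[str]:
    """Peel at most one conjunction, then at most one particle, off the front."""
    pieces = []
    for group in (CONJUNCTIONS, PARTICLES):
        for clitic in group:
            remaining = len(token) - len(clitic)
            if token.startswith(clitic) and remaining >= MIN_STEM_LEN:
                pieces.append(clitic + "+")   # mark the split-off clitic
                token = token[len(clitic):]
                break                         # at most one clitic per group
    pieces.append(token)
    return pieces


def preprocess(sentence: str) -> str:
    """Apply the splitter to every whitespace-separated token."""
    return " ".join(p for tok in sentence.split() for p in split_proclitics(tok))


if __name__ == "__main__":
    print(preprocess("wbAlktAb"))
    print(preprocess("ktAb"))
```

On `wbAlktAb` ("and with the book") the sketch yields `w+ b+ AlktAb`, but it also wrongly splits the plain noun `ktAb` ("book") into `k+ tAb`. Errors of this kind are why purely string-based splitting is insufficient and why disambiguation matters.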
Neural Machine Translation (NMT) systems have been shown to perform impressive...
In this paper, we report on a set of initial results for English-to-Arabic Statistical Machine Tran...
This thesis discusses different approaches to machine translation (MT) from Dialectal Arabic (DA) to...
In this paper, we study the effect of different word-level preprocessing decisions for Arabic on SMT...
Statistical machine translation is quite robust when it comes to the choice of input representation....
This is an accepted manuscript of an article published by Elsevier BV in Computer Speech & Language ...
We describe an approach to automatic source-language syntactic preprocessing in the context of Arabi...
Abstract. Many studies in Statistical Machine Translation (SMT) for input languages m...
Thesis (Ph. D. in Information Technology)--Massachusetts Institute of Technology, Dept. of Civil and...
The research context of this paper is developing hybrid machine translation (MT) systems that exploi...
This paper is interested in improving the quality of Arabic-English statistical machine translation ...
Translating from English into a morphologically richer language like Arabic is a challenge in statis...
We present four techniques for online handling of Out-of-Vocabulary words in Phrase-based Statistical...
Arabic segmentation was already applied successfully for the task of statistical machine translation...
Arabic is considered to have a rich morphology compared to the English language. This fact adversely aff...