We introduce a bilingually motivated word segmentation approach to languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Our approach is motivated by the insight that PB-SMT systems can be improved by optimizing the input representation to reduce the predictive power of translation models. We first present an approach that optimizes the existing segmentation of both source and target languages for PB-SMT, and we demonstrate its effectiveness on a Chinese–English MT task, that is, we measure the influence of the segmentation on the performance of PB-SMT systems. We report a 5.44% relative increase in BLEU score and a consistent increase ac...
Aiming to overcome the shortcomings of word-based ... In this paper, a new algorithm called Multi-...
The statistical framework has proved to be very successful in machine translation. The main reason ...
Myanmar sentences are written as contiguous sequences of syllables with no characters delimiting the w...
We introduce a bilingually motivated word segmentation approach to languages where word boundaries a...
We introduce a word segmentation approach to languages where word boundaries are not orthographicall...
We introduce a word segmentation approach to languages where word boundaries are not orthographica...
In the last decade, while statistical machine translation has advanced significantly, there is still...
We present a novel segmentation approach for Phrase-Based Statistical Machine Translation (PB-SMT)...
We propose a method that bilingually segments sentences in languages with no clear delim...
We present an unsupervised word segmentation model for machine translation. The model uses existing ...
Languages that have no explicit word delimiters often have to be segmented for statistical machine...
State-of-the-art statistical machine translation systems make use of a large translation table obta...
Unsupervised word segmentation (UWS) can provide domain-adaptive segmentation for statistical machi...
State-of-the-art statistical machine translation systems make use of a large translation table obt...
We introduce a simple method to pack words for statistical word alignment. Our goal is to simplify t...