Previous work has shown that Chinese word segmentation is useful for machine translation to English, yet the way different segmentation strategies affect MT is still poorly understood. In this paper, we demonstrate that optimizing segmentation for an existing segmentation standard does not always yield better MT performance. We find that other factors such as segmentation consistency and granularity of Chinese “words” can be more important for machine translation. Based on these findings, we implement methods inside a conditional random field segmenter that directly optimize segmentation granularity with respect to the MT task, providing an improvement of 0.73 BLEU. We also show that improving segmentation consistency using externa...
SYSTRAN’s Chinese word segmentation is one important component of its Chinese-English machine transl...
The Chinese language, unlike some western languages, is written without a space between any two word...
Unsupervised word segmentation (UWS) can provide domain-adaptive segmentation for statistical machi...
Word segmentation has been shown helpful for Chinese-to-English machine translation (MT), ...
The Chinese language, unlike English, is written without marked word boundaries, and Chinese word se...
A Chinese sentence is represented as a sequence of characters, and words are not separated from eac...
Chinese word segmentation (CWS) is a necessary step in Chinese-English statistical machine translat...
Languages that have no explicit word delimiters often have to be segmented for statistical machine...
Unknown words and word segmentation granularity are two main problems in Chinese word segmentation f...
Almost all Chinese language processing tasks involve word segmentation of the language input as thei...
We introduce a bilingually motivated word segmentation approach to languages where word boundaries a...
Chinese texts do not contain spaces as word separators like English and many alphabetic languages. ...
In the last decade, while statistical machine translation has advanced significantly, there is still...
Word segmentation is helpful in Chinese natural language processing in many aspects. However, it is ...