Unlike English and many other alphabetic languages, Chinese text does not use spaces to separate words. To train translation models with Moses, we must first segment Chinese text into sequences of words. A growing number of Chinese segmentation tools have become available on the Internet in recent years. However, some of these tools were trained on general text and may not handle domain-specific terms in patent documents well. Some machine-learning-based tools require us to supply segmented Chinese text to train their segmentation models. In both cases, providing segmented Chinese text, whether to refine a pre-trained model or to build a new one, is an important basis for successful Chinese-English machine translation syst...
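As a rough illustration of what "segmenting Chinese text into sequences of words" means in practice, the sketch below uses greedy forward maximum matching against a small word list and emits space-separated tokens, which is the general shape of input a whitespace-tokenized training pipeline expects. This is not the method of the abstract above: the `segment` function, the `max_len` parameter, and the tiny `lexicon` are all hypothetical and chosen only for illustration. A real system would use a trained segmenter and a much larger, domain-adapted lexicon.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching (illustrative only).

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character otherwise.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical mini-lexicon with a few general and patent-related terms.
lexicon = {"机器", "翻译", "机器翻译", "专利", "文件"}

tokens = segment("专利文件机器翻译", lexicon)
# Join tokens with spaces so downstream tools can split on whitespace.
print(" ".join(tokens))
```

Adding a domain term such as 机器翻译 to the lexicon changes the output from two tokens (机器 翻译) to one, which is the kind of domain sensitivity the abstract describes for patent text.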
A parallel corpus is a valuable resource for cross-language information retrieval and data-driven natu...
Unsupervised word segmentation (UWS) can provide domain-adaptive segmentation for statistical machi...
This document describes the segmentation guidelines for the Penn Chinese Treebank Project. The goal ...
The Chinese language, unlike English, is written without marked word boundaries, and Chinese word se...
Almost all Chinese language processing tasks involve word segmentation of the language input as thei...
A Chinese sentence is represented as a sequence of characters, and words are not separated from eac...
Previous work has shown that Chinese word segmentation is useful for machine translation to Englis...
Textual information written in Chinese now represents a huge knowledge repository. The first step of...
Languages that have no explicit word delimiters often have to be segmented for statistical machine...
In the last decade, while statistical machine translation has advanced significantly, there is still...
This paper presents a Chinese word segmentation system that uses improved source-channel models of C...
Word segmentation has been shown to be helpful for Chinese-to-English machine translation (MT), ...
This paper presents a Chinese word segmentation system that uses improved source-channel models of C...
This paper introduces TRGTK’s system for Patent Machine Translation at the NTCIR-10 Workshop. I...
The Chinese language is written without using spaces or other word delimiters. Although a text may b...