Most current Chinese word alignment systems first apply a word segmentation system to identify words. However, word boundaries often do not match across languages, and these mismatches degrade word alignment performance. In this paper, we propose two unsupervised methods that adjust word segmentation so that as many tokens as possible map 1-to-1 between corresponding sentences. The first method learns affix rules from a bilingual terminology bank. The second method uses an impurity measure motivated by decision trees. Our experiments show that both adjustment methods significantly improve word alignment performance.
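The abstract does not say which impurity measure is used or how it is applied. As a rough illustration only, the sketch below computes the two impurity measures commonly used in decision trees (Gini and entropy) over a distribution of target-side tokens that co-occur with one source token. The function names and the toy counts are assumptions made for this example, not the paper's method or data.

```python
from collections import Counter
from math import log2


def gini_impurity(counts):
    """Gini impurity of a count distribution: 1 - sum(p_i ** 2)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def entropy_impurity(counts):
    """Shannon entropy (in bits) of a count distribution."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


# Hypothetical counts: how often a source token was linked to each target token.
concentrated = Counter({"computer": 10})                         # always one target
scattered = Counter({"computer": 5, "electric": 3, "brain": 2})  # several targets

gini_concentrated = gini_impurity(concentrated)
gini_scattered = gini_impurity(scattered)
entropy_concentrated = entropy_impurity(concentrated)
entropy_scattered = entropy_impurity(scattered)
```

Under this reading, a token whose links concentrate on a single target has impurity 0 and behaves like a clean 1-to-1 unit, while a token with scattered links scores higher and could be a candidate for re-segmentation. Whether the paper uses impurity in this way is not stated in the abstract.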
Languages that have no explicit word delimiters often have to be segmented for statistical machine...
We introduce a simple method to pack words for statistical word alignment. Our goal is to simplify t...
Sentence alignment is most important for Chinese-English bilingual corpus alignment. This paper anal...
In this paper, we present a new word alignment combination approach on language pairs where one lang...
In this paper, we use a lexical method to do sentence alignment for an English-Chinese co...
Bilingual alignment is a crucial problem in the research of natural language processing, and word al...
One of the bilingual corpus processing methods is the alignment of two languages on each linguistic ...
We introduce a word alignment framework that facilitates the incorporation of syntax encoded in bil...
We present a novel approach to improve word alignment for statistical machine translation (SMT). C...
Word alignment in bilingual or multilingual parallel corpora has been a challenging issue for natura...
The fact that words are not conventionally demarcated in Chinese orthography makes the process of wo...
In this paper, a method for the word alignment of English-Chinese corpus based on chunks is proposed...