We report experimental results on the automatic extraction of an English-Chinese translation lexicon by statistical analysis of a large parallel corpus, using limited amounts of linguistic knowledge. To our knowledge, these are the first empirical results of this kind between an Indo-European and a non-Indo-European language for any significant vocabulary and corpus size. The learned vocabulary covers about 6,500 English words, with translation precision in the 86-96% range and alignment proceeding at the paragraph, sentence, and word levels. Specifically, we report (1) progress on the HKUST English-Chinese Parallel Bilingual Corpus, (2) experiments supporting the usefulness of restricted lexical cues for statistical paragraph and sentence al...
We propose a novel context heterogeneity similarity measure between words and their translations in ...
This paper describes a system that automatically mines English-Chinese translation pairs from large ...
In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus ...
We report experiments on automatic learning of an English-Chinese translation lexicon, through stati...
This paper presents a hybrid approach to deriving a translation lexicon from unaligned parallel Chin...
We propose a novel statistical translation model to improve translation selection of collocations. In...
A parallel corpus is a valuable resource for cross-language information retrieval and data-driven natu...
This paper first describes an experiment to construct an English-Chinese parallel corpus, then apply...
The process of constructing translation lexicons from parallel texts (bitexts) can be broken down in...
The paper reports on a series of experiments to extract matching lexical items from a 6.1 million se...
An automated approach to extracting a bilingual lexicon (or dictionary) from comparable, non-parallel ...
Demand for Chinese-to-English translation has increased over recent years. In contrast, resources fo...
We describe our experience with automatic alignment of sentences in parallel English-Chinese texts. ...
Most Chinese-English parallel corpora consist of English source texts translated into Chinese. This ...
Technical-term translation represents one of the most difficult tasks for human translators since (1...