In recent years, state-of-the-art cross-linguistic systems have been based on parallel corpora. Nevertheless, it is sometimes difficult to find translations of a particular technical term or named entity, even in a very large parallel corpus. In this paper, we present a new method for learning to find translations on the Web for a given term. In our approach, we use a small set of terms and translations to obtain mixed-code snippets returned by a search engine. We then automatically annotate the data with translation tags, automatically generate features to augment the tagged data, and automatically train a conditional random fields model for identifying translations. At runtime, we obtain mixed-code webpages containing the given term and run...
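The automatic annotation step described above can be illustrated with a minimal sketch. The abstract does not specify the tag set or the tokenization, so the B/I/O labels, the per-character treatment of Chinese text, and the names `tokenize` and `tag_snippet` are all assumptions made for illustration, not the authors' implementation.

```python
import re

# Assumed tokenization: one token per Latin-script word or number,
# one token per CJK character, one token per other non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]|[^\sA-Za-z0-9\u4e00-\u9fff]")


def tokenize(text):
    """Split a mixed-code string into tokens (hypothetical helper)."""
    return TOKEN_RE.findall(text)


def tag_snippet(snippet, translation):
    """Label every token of a snippet with B, I, or O.

    B marks the first token of a known translation, I marks the
    remaining tokens of that translation, and O marks everything else.
    The B/I/O scheme is an assumption; the abstract only says that the
    data is annotated with "translation tags".
    """
    tokens = tokenize(snippet)
    target = tokenize(translation)
    labels = ["O"] * len(tokens)
    n = len(target)
    i = 0
    while n > 0 and i + n <= len(tokens):
        if tokens[i:i + n] == target:
            labels[i] = "B"
            for j in range(i + 1, i + n):
                labels[j] = "I"
            i += n  # skip past the match so tagged spans never overlap
        else:
            i += 1
    return list(zip(tokens, labels))


if __name__ == "__main__":
    # A seed pair (term, translation) and a snippet containing both.
    for token, label in tag_snippet("named entity (命名實體) recognition", "命名實體"):
        print(token, label)
```

Sequences labeled this way, once augmented with features, could serve as training data for a sequence labeler such as a conditional random fields model.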
This paper describes a system that automatically mines English-Chinese translation pairs from large ...
We present a high-precision, language-independent transliteration framework applicable to bilingual ...
Parallel corpora are a crucial resource in research fields such as cross-lingual information retrie...
We introduce a method for learning to find domain-specific translations for a given term on the Web....
Documents in languages such as Chinese, Japanese and Korean sometimes annotate terms with their tran...
Contemporary machine translation systems usually rely on offline data retrieved from the web for ind...
This paper introduces a method for learning to find translations of a given source term on the Web. ...
The quality of machine translation is often dependent on the quality of lexica...
Although more and more language pairs are covered by machine translation (MT) services, there are st...
Key phrases are usually among the most information-bearing linguistic structures. Translating them c...
We present a method for learning to find English to Chinese transliterations on the Web....
Mining translations from abundant Web data can be applied in many fields such as computer assisted l...
New words such as names and technical terms appear frequently. As such, the bilingual lexicon of a...