STRAND (Resnik, 1998) is a language-independent system for automatic discovery of text in parallel translation on the World Wide Web. This paper extends the preliminary STRAND results by adding automatic language identification, scaling up by orders of magnitude, and formally evaluating performance. The most recent end-product is an automatically acquired parallel corpus comprising English-French document pairs, approximately [...] million words per language. Text in parallel translation is a valuable resource in natural language processing. St...
Although more and more language pairs are covered by machine translation (MT) services, there are st...
Parallel corpora are indispensable resources for a variety of multilingual natural language processi...
Multilingual resources are useful for linguistic studies, translation, and many other tasks. Unfortu...
STRAND (Resnik, 1998) is a language-independent system for automatic discovery of text in parallel t...
Parallel corpora have become an essential resource for work in multilingual natural language process...
Parallel corpora have become an essential resource for work in multilingual natural language process...
Parallel corpora are a valuable resource for machine translation, multi-lingual text retrieval, languag...
In this thesis, we propose a content-based method of mining bilingual parallel documents from websit...
Parallel corpora are a valuable resource for machine translation, but at present their availability ...
This paper describes BABYLON, a system that attempts to overcome the shortage of parallel texts in l...
Parallel corpora are a crucial resource in research fields such as cross-lingual information retrie...
Parallel corpora are playing a crucial role in multilingual natural language processing. U...
Discovering parallel corpora on the web is a challenging task. In this paper, we use cross-language ...
Title: Mining Parallel Corpora from the Web Author: Bc. Jakub Kúdela Author's e-mail address: jakub....
The majority of the world's languages are poorly represented in informational media like radio, tele...