In cross-language information retrieval it is often important to align words that are similar in meaning in two corpora written in different languages. Previous research shows that using context similarity to align words is helpful when no dictionary entry is available. We suggest a new method which selects a subset of words (pivot words) associated with a query and then matches these words across languages. To detect word associations, we demonstrate that a new Bayesian method for estimating Point-wise Mutual Information provides improved accuracy. In the second step, matching is done in a novel way that calculates the chance of an accidental overlap of pivot words using the hypergeometric distribution. We implemented a wide variety...
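The two steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the Bayesian PMI estimator or the exact parameterization of the hypergeometric test, so the code below assumes plain maximum-likelihood PMI and a one-sided tail probability, and all function and parameter names (`pmi`, `overlap_chance`, `vocab_size`, and so on) are hypothetical.

```python
from math import comb, log


def pmi(joint_count, count_x, count_y, total):
    """Plain maximum-likelihood PMI: log( P(x, y) / (P(x) * P(y)) ).

    Assumption: this is the standard estimator, not the Bayesian
    estimate that the abstract refers to.
    """
    return log((joint_count * total) / (count_x * count_y))


def overlap_chance(vocab_size, query_pivots, candidate_pivots, overlap):
    """Probability of seeing at least `overlap` shared pivot words by chance.

    Assumption: the candidate's pivots are modeled as `candidate_pivots`
    draws without replacement from `vocab_size` words, of which
    `query_pivots` count as hits (a hypergeometric tail probability).
    """
    upper = min(query_pivots, candidate_pivots)
    all_draws = comb(vocab_size, candidate_pivots)
    # Sum the hypergeometric probability mass for k = overlap .. upper.
    tail = sum(
        comb(query_pivots, k) * comb(vocab_size - query_pivots, candidate_pivots - k)
        for k in range(overlap, upper + 1)
    )
    return tail / all_draws
```

Under these assumptions, a smaller value of `overlap_chance` means the shared pivot words are less likely to be accidental, so candidates could be ranked by ascending chance.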
Selection of the most suitable translation among all translation candidates returned by bilingual di...
Existing word similarity measures are not robust to data sparseness since they rely only on the poin...
The limited coverage of available translation lexicons can pose a serious challenge in some cross-l...
Current methods for word alignment require considerable amounts of parallel text to deliver accurate...
Bilingual or even polylingual word embeddings created many possibilities for tasks involving multipl...
We propose a new approach to identifying semantically similar words across languages. The approach i...
Amsterdam: IOS Press. identify translation equivalents at the word or phrase level. Such techniques c...
Automatically compiling bilingual dictionaries of technical terms from comparable corpora is a chal...
Cross-language information retrieval concerns the problem of finding information in one language in ...
In this chapter, we focus on the specific problem of sentence alignment given two comparable corpora...
Using comparable corpora to find new word translations is a promising approach for extending biling...
To retrieve documents written in different languages, it is necessary to construct parallel documents. C...
There is an increasing need for document search mechanisms capable of matching a natural language qu...
Cross-lingual information retrieval is a difficult task typically involving query translation into m...