Most work on bilingual lexicon extraction from comparable corpora rests on the implicit hypothesis that the corpora are balanced. However, the various related approaches are relatively insensitive to the size of each part of the comparable corpus. In this context, we study the influence of unbalanced comparable corpora on the quality of bilingual terminology extraction through a series of experiments. Our results show the conditions under which using an unbalanced comparable corpus can yield a significant gain in the quality of the extracted lexicons.
Thematic comparable corpora gather texts on the same topic written in several languages, highly...
This article was published in the journal 'Traitement Automatique des Langues', Vol. 47, no. 1: 113-136...
Funding: ANR Metricc project (grant ANR-08-CORD-013), ANRT (CIFRE no. 2010/270), company Lin...
Bilingual lexicon extraction from comparable corpora gives good results for la...
In this article, we study the problem of the comparability of the documents compos...
The main work in bilingual lexicon extraction from comparable corpora is based...
We study in this chapter the problem of measuring the degree of comparability ...
Bilingual corpora are essential resources for overcoming the language barrier ...
The main work in bilingual lexicon extraction from comparable corpora is based...
Bilingual corpora are an essential resource used to cross the language barrier in multilingual Natur...
In this article, we present a simple and effective approach for extracting bil...
Comparable corpora are the main alternative to the use of parallel corpora to ...
In this article, we present a new way of approaching the problem of ac...
Our work concerns the automatic extraction of a list of aligned terms with their translations (i.e. ...
Bilingual lexicons are central components of machine translation and cross-lingual information retri...