Sets of lexical items sharing a significant aspect of their meaning (concepts) are fundamental in linguistics and NLP. Manual concept compilation is labor-intensive, error-prone, and subjective. We present a web-based concept extension algorithm. Given a set of terms specifying a concept in some language, we translate them to a wide range of intermediate languages, disambiguate the translations using web counts, and discover additional concept terms using symmetric patterns. We then translate the discovered terms back into the original language, score them, and extend the original concept by adding back-translations having high scores. We evaluate our method in 3 source languages and 45 intermediate languages, using both human judgments...
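The pipeline described above (translate, disambiguate, discover, back-translate, score, extend) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `extend_concept`, the toy dictionaries, the `web_count` callable standing in for web hit counts, the `related` table standing in for symmetric-pattern discovery, and the scoring rule (the number of intermediate languages that yield a candidate). The abstract does not specify the actual scoring function, so this should not be read as the authors' implementation.

```python
from collections import defaultdict


def extend_concept(seeds, forward, web_count, related, backward, min_votes=2):
    """Toy sketch of concept extension through intermediate languages.

    All arguments are hypothetical stand-ins:
      forward   {lang: {source_term: [candidate translations]}}
      web_count callable (lang, term) -> int, a stand-in for web hit counts
      related   {lang: {term: [co-occurring terms]}}, a stand-in for
                symmetric-pattern discovery
      backward  {lang: {intermediate_term: source_term}}
    """
    votes = defaultdict(int)
    seed_set = set(seeds)
    for lang, candidates in forward.items():
        # 1. Translate each seed; disambiguate by keeping the candidate
        #    with the highest (assumed) web count.
        translated = set()
        for seed in seeds:
            options = candidates.get(seed, [])
            if options:
                translated.add(max(options, key=lambda w: web_count(lang, w)))
        # 2. Discover additional terms in the intermediate language.
        found = set()
        for term in translated:
            found.update(related.get(lang, {}).get(term, []))
        found -= translated
        # 3. Translate the discovered terms back into the source language.
        back_dict = backward.get(lang, {})
        back = {back_dict[t] for t in found if t in back_dict}
        # 4. Assumed scoring rule: one vote per supporting language.
        for term in back - seed_set:
            votes[term] += 1
    # 5. Keep only candidates supported by enough intermediate languages.
    return sorted(t for t, v in votes.items() if v >= min_votes)


# Toy data, invented for this example.
forward = {
    "es": {"apple": ["manzana"], "pear": ["pera"]},
    "de": {"apple": ["Apfel"], "pear": ["Birne", "Gluehbirne"]},
}
counts = {("de", "Birne"): 90, ("de", "Gluehbirne"): 5}
related = {
    "es": {"manzana": ["naranja", "pera"], "pera": ["uva"]},
    "de": {"Apfel": ["Orange", "Birne"], "Birne": ["Kirsche"]},
}
backward = {
    "es": {"naranja": "orange", "uva": "grape"},
    "de": {"Orange": "orange", "Kirsche": "cherry"},
}

result = extend_concept(
    ["apple", "pear"],
    forward,
    lambda lang, term: counts.get((lang, term), 0),
    related,
    backward,
)
print(result)  # ['orange']
```

In this toy run only "orange" is returned, because it is the only back-translation supported by both intermediate languages; "grape" and "cherry" each appear in a single language and fall below the assumed `min_votes` threshold.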
This paper presents new methodology towards the automatic development of multilingual Web portal for...
Many algorithms extract terms from text together with some kind of taxonomic classification (is-a)...
Word lists that contain closely related sets of words are a critical requirement in machine understan...
We present a method which, given a few words defining a concept in some language, retrieves, disamb...
The Internet is experiencing an explosion of information presented in different languages. Though wr...
To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our appr...
In previous work, we found that a great deal of information about noun attributes can be extracted f...
Terminologists scan large amounts of specialized texts to discover the terms for the concepts in a g...
Multilingual lexical resources provide information about linguistic relation of words in-between lan...
This paper introduces DUCKS, Data Unified Conceptual Knowledge Sets, as a tool for aligning lexical ...
The objective of this work is to investigate methods by which word senses in a variety of differen...
Recent advances in generating monolingual word embeddings based on word co-occurrence for universal ...
Machine learning about language can be improved by supplying it with specific knowledge and sources ...
A knowledge base for real-world language processing applications should consist of a large b...
In this paper we present a method for mining the Web in order to extract lexical patterns ...