Term co-occurrence data has been used extensively in many applications, ranging from information retrieval to word sense disambiguation. Co-occurrence data has two major limitations. The first is known as the data sparseness problem or the zero frequency problem: for a majority of term pairs, the probability that they co-occur in even a large corpus is very small. The second is that each term in co-occurrence data is treated as a meaningless symbol; in other words, terms do not have types or any semantic relationships with other terms. In this paper, we introduce a novel approach to address these two limitations. We create concept-aware co-occurrence data wherein each term is not a symbol, but an entry...
This work introduces a strategy for estimating the semantic likeness between ideas in Knowledge Grap...
The use of topical information has long been studied in the context of information retrieval. For ex...
This thesis aims to address the current limitations in short texts clustering and provides a systema...
The present research focuses on the study of the distribution of lexis in corpus and its aim is to i...
This position paper presents a comparative study of co-occurrences. Some similarities and difference...
The spread and abundance of electronic documents requires automatic techniques for extracting useful...
Abstract. The spread and abundance of electronic documents requires automatic techniques for extract...
This thesis addresses the tasks of concept disambiguation and clustering. Concept disambiguation is ...
Aside from syntax, linguistic knowledge can be separated into two distinct parts, encyclopedic knowl...
Topic models are known to suffer from sparsity when applied to short text data. The problem is cause...
A computational model of the construction of word meaning through exposure to ...
We provide a simple and general solution for the discovery of scarce topics in unbalanced short-text...
Many algorithms extract terms from text together with some kind of taxonomic classif...
This paper describes the National Research Council (NRC) Word Sense Disambiguation (WSD) system, as ...
Published as Coyote Papers: Working Papers in Linguistics, Language in Cognitive Science. The paper pr...