Unsupervised topic models, such as Latent Dirichlet Allocation (LDA), are widely used as automated feature engineering tools for textual data. They model word semantics through latent topics, on the basis that semantically related words occur in similar documents. However, the word weights assigned by these topic models do not represent the semantic meaning of the words with respect to user information needs. In this paper, we present an innovative and effective extended random sets (ERS) model to enhance the semantics of topical words. The proposed model is used as a word weighting scheme for relevance feature selection (FS). It accurately weights words based on their appearance in the LDA latent topics and the relevant documents. The expe...
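As a rough illustration of weighting words by their appearance in both the latent topics and the relevant documents, the sketch below scores each word by the topic probability mass it receives in every relevant document that contains it. This is not the ERS model, whose definition is not given in this abstract; the function name `topical_word_weights`, the scoring rule, and all toy probabilities are assumptions made only for this example. In practice the document-topic and topic-word distributions would come from a trained LDA model.

```python
from collections import defaultdict


def topical_word_weights(relevant_docs, doc_topic, topic_word):
    """Hypothetical scoring rule (an assumption, not the paper's ERS formula).

    weight(w) = sum over relevant docs d containing w of
                sum over topics z of P(z | d) * P(w | z)
    """
    weights = defaultdict(float)
    for doc_id, tokens in relevant_docs.items():
        # Count each word once per document it appears in.
        for word in set(tokens):
            for topic, p_topic in enumerate(doc_topic[doc_id]):
                weights[word] += p_topic * topic_word[topic].get(word, 0.0)
    return dict(weights)


# Toy, hand-made inputs standing in for the output of a trained LDA model.
relevant_docs = {
    "d1": ["stock", "market", "price"],
    "d2": ["market", "trade"],
}
doc_topic = {"d1": [0.5, 0.5], "d2": [1.0, 0.0]}  # P(topic | doc)
topic_word = [                                     # P(word | topic)
    {"market": 0.5, "trade": 0.25, "stock": 0.25},
    {"price": 0.5, "stock": 0.5},
]

weights = topical_word_weights(relevant_docs, doc_topic, topic_word)
# Words ordered from highest to lowest weight; a feature selector
# could keep the top-k entries of this list.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

A word that is probable under the dominant topics of several relevant documents rises to the top, while a word that is frequent in a topic but absent from the relevant documents receives no weight at all.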
Master's dissertation in Informatics Engineering presented to the Faculdade de Ciências e Tecnologia ...
Abstract — Text classification has become a critical step in big data analytics. For supervised mach...
This paper studies how to incorporate external word correlation knowledge to improve the cohere...
It is challenging to discover relevant features from long documents that describe user information n...
Selecting features from documents that describe user information needs is challenging due to the nat...
Topic modelling methods such as Latent Dirichlet Allocation (LDA) have been successfully applied to ...
Topic modelling methods such as Latent Dirichlet Allocation (LDA) have been successfully applied to ...
This thesis presents innovative and effective feature selection models and frameworks to select and ...
Abstract: Text categorization is the task of automatically assigning unlabeled text documents to some...
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical mod...
Most relevance feature discovery models only consider document-level evidence, which may introduce u...
Probabilistic topic models are widely used to discover latent topics in document collections, while...
Most relevance feature discovery models only consider document-level evidence, which may introduce u...
We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side informa...
It is observed that distinct words in a given document have either strong or weak ability in deli...