Currently, there is a large amount of news in digital format that needs to be classified or labeled automatically according to its content. Latent Dirichlet Allocation (LDA) is an unsupervised technique that automatically creates topics based on the words in documents. The present work applies LDA to analyze and extract topics from digital news in Spanish. A total of 198 digital news items were collected from a university news blog. The data were pre-processed and represented in vector space, and k values (the number of topics) were selected based on the coherence metric. A term frequency-inverse document frequency (TF-IDF) matrix and a combination of unigrams and bigrams produce topics with a variety of terms and topics related to unive...
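The vector-space representation mentioned above can be illustrated with a minimal sketch of a TF-IDF weighting over unigrams and bigrams. This is not the authors' implementation: the helper names `ngrams` and `tfidf`, the whitespace tokenizer, the `idf = ln(N / df)` variant, and the toy Spanish sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    out = []
    for n in range(1, n_max + 1):
        out.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return out


def tfidf(docs, n_max=2):
    """Build a sparse TF-IDF representation: one {term: weight} dict per document.

    Assumes raw term frequency and idf = ln(N / df); other variants exist.
    """
    # Naive lowercase + whitespace tokenization (an assumption; real
    # pre-processing would also handle punctuation, stop words, etc.).
    term_lists = [ngrams(doc.lower().split(), n_max) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    rows = []
    for terms in term_lists:
        tf = Counter(terms)
        rows.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return rows


# Hypothetical toy corpus (not the 198 news items from the study).
docs = [
    "la universidad abre nuevas becas",
    "la universidad organiza un congreso",
    "el equipo gana el torneo",
]
rows = tfidf(docs)
```

Each entry of `rows` corresponds to one row of the TF-IDF matrix. The bigram "la universidad" is kept as its own term alongside the unigrams "la" and "universidad", which is how multi-word expressions can surface among the topic terms.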
Since the Internet was born, most people have had free access to many sources of information. Every day...
We describe the methodology that we followed to automatically extract topics corresponding to known ...
This paper is in the field of natural language processing. It applied unsupervised machine learning ...
Recently, a probabilistic topic modelling approach, latent Dirichlet allocation (LDA), has been exte...
The amount of news displayed on online news portals often does not indicate the topic being discuss...
This work aims at discovering topics in a text corpus and classifying the most relevant terms for ea...
A massive number of news articles leads to a potential problem in automatic classifi...
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requ...
Innovations in Intelligent Systems and Applications Conference (ASYU) -- OCT 04-06, 2018 -- Adana, T...