This bachelor thesis empirically tests whether classification can improve the topic models Latent Dirichlet Allocation (LDA) and Author Topic Modeling (ATM) on data from the social media platform Twitter. For this purpose, a corpus was classified with the Dewey Decimal Classification (DDC) and then used to train the topic models; a second dataset, the unclassified corpus, served as the comparison. The assumption that classification would improve the topic models did not hold for LDA, where no sufficient improvement was achieved. The ATM model, on the other hand, did improve with the classification. In general, the ATM model performed s...
Latent topic analysis has emerged as one of the most effective methods for classifying, clustering a...
As a quantitative text analytic method, Latent Dirichlet Allocation (LDA) topic modeling has been wi...
In this paper, I apply Latent Dirichlet Allocation (LDA) to cluster 100,000 health-related articles u...
Latent Dirichlet Allocation (LDA) is a topic model that has been applied to various fields, includi...
Providing high-quality topic inference in today's large and dynamic corpora, such as Twitter, is...
This paper is in the field of natural language processing. It applied unsupervised machine learning ...
The rise of social media analysis is currently providing a new requirement. We are required to concl...
In Indonesia, Twitter is one of the most widely used social media platforms. Because of the diverse ...
Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and...
Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on ...
This work aims at discovering topics in a text corpus and classifying the most relevant terms for ea...
Twitter has become an essential medium for probing differing views on issues within society. One suc...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
This research project aims to provide a clear and concise guide to Latent Dirichlet Allocation which...
Latent Dirichlet Allocation (LDA) has become the most stable and widely used topic model to derive t...