We explore the use of a pseudo-topic model that imitates Latent Dirichlet Allocation (LDA), based on our original relevance metric, as a tool to facilitate distant annotation of short documents (often one or two sentences, or less). Our exploration takes the form of annotating tweets for emotions, the use case currently of interest to us, but we believe the method could be extended to any multi-class labeling task on documents of similar length. Tweets are gathered via the Twitter API using track terms thought likely to capture tweets with a greater chance of exhibiting each emotional class: 3,000 tweets for each of 26 topics anticipated to elicit emotional discourse. Our pseudo-topic model is used to produce relevance-ranked vocabularies for ...
With its rapid user growth, Twitter has become an essential source of information about what events...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
We describe the methodology that we followed to automatically extract topics corresponding to known ...
Latent topic analysis has emerged as one of the most effective methods for classifying, clustering a...
Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on ...
Latent Dirichlet allocation (LDA) is a topic model that has been applied to various fields, includi...
Texts can be characterized from their content using machine learning and natural language processing...
Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and...
Providing high-quality topic inference in today's large and dynamic corpora, such as Twitter, is...
With the rapid proliferation of social networking sites (SNS), automatic topic extraction from vario...
Notwithstanding recent work which has demonstrated the potential of using Twitter messages for conte...
Recently, there has been an exponential rise in the use of online social media systems like Twitter ...
The aim of this bachelor thesis is to compare and empirically test the use of classification to impr...
Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hi...