Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today’s large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model. In this paper, we empirically evaluate the performance of several methods for topic inference in previously unseen documents, including methods based on Gibbs sampling, variational inference, and a new method inspired by text classification. The classification-based inference method produces results ...
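The Gibbs-sampling variant of held-out inference mentioned above can be illustrated with a minimal sketch. This is not code from the paper: it assumes a pre-trained topic-word matrix `phi` (so only the document's topic assignments are resampled) and a symmetric Dirichlet prior `alpha`; all names are illustrative.

```python
import random

def infer_topics(doc, phi, alpha=0.1, iters=100, seed=0):
    """Estimate a new document's topic proportions with phi held fixed.

    doc : list of word ids
    phi : phi[k][w] = p(word w | topic k), assumed pre-trained
    """
    rng = random.Random(seed)
    K = len(phi)
    # Random initial topic assignment for each token.
    z = [rng.randrange(K) for _ in doc]
    counts = [0] * K
    for k in z:
        counts[k] += 1
    for _ in range(iters):
        for i, w in enumerate(doc):
            counts[z[i]] -= 1
            # p(z_i = k | rest) proportional to (n_k + alpha) * phi[k][w]
            weights = [(counts[k] + alpha) * phi[k][w] for k in range(K)]
            z[i] = rng.choices(range(K), weights=weights)[0]
            counts[z[i]] += 1
    # Posterior mean estimate of the document's topic distribution.
    total = sum(counts) + K * alpha
    return [(counts[k] + alpha) / total for k in range(K)]
```

Because the topics are frozen, this per-document sampler is cheap relative to retraining the full model, which is the setting the abstract describes.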
Recently, there has been considerable progress on designing algorithms with provable guarantees - ty...
Topic models provide a useful method for dimensionality reduction and exploratory data analysis in l...
It is estimated that the world’s data will increase to roughly 160 billion terabytes by 2025, with m...
Inference in topic models typically involves a sampling step to associate latent variables with obse...
Latent Dirichlet analysis, or topic modeling, is a flexible latent variable framework for modeling h...
Abstract Weak topic correlation across document collections with different numbers of topics in indi...
We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs...
There has been an explosion in the amount of digital text information available in recent years, lea...
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document coll...
Logistic-normal topic models can effectively discover correlation structures among latent topics. Ho...
Topic models, and more specifically the class of latent Dirichlet allocation (LDA), are widely us...
The tremendous increase in the amount of available research documents impels researchers to propose ...
Topic modeling is an unsupervised learning task that discovers the hidden topics in a ...