A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable due to the large number of discrete latent variables, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and the empirical likelihood method. In this paper, we demonstrate experimentally that commonly-used methods are unlikely to accurately estimate the probability of unseen documents, and we propose two alternative methods that are both accurate and efficient.
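To make the estimation problem concrete, the following is a minimal sketch (not the paper's implementation) of two ways to estimate a held-out document's log probability under LDA with fixed topics: a simple importance sampler that uses the Dirichlet prior as its proposal, and the harmonic mean estimator mentioned above. All names (phi, alpha, doc_tokens) and the toy dimensions are illustrative assumptions, and the harmonic mean version uses a crude stand-in for posterior samples rather than a real Gibbs sampler.

import numpy as np

rng = np.random.default_rng(0)

K, V = 5, 30                                # topics, vocabulary size (toy values)
alpha = np.full(K, 0.1)                     # symmetric Dirichlet prior over topic mixtures
phi = rng.dirichlet(np.ones(V), size=K)     # K x V topic-word distributions (assumed fixed)
doc_tokens = rng.integers(0, V, size=40)    # a toy held-out document (bag of word ids)

def log_p_doc_given_theta(theta):
    # log p(w | theta, phi), marginalising each token's topic assignment
    word_probs = theta @ phi                # length-V mixture over words
    return np.log(word_probs[doc_tokens]).sum()

def importance_sampling_estimate(n_samples=2000):
    # Estimate log p(w) = log E_prior[p(w | theta)] by sampling theta from the prior.
    thetas = rng.dirichlet(alpha, size=n_samples)
    log_w = np.array([log_p_doc_given_theta(t) for t in thetas])
    m = log_w.max()
    return m + np.log(np.exp(log_w - m).mean())          # log-mean-exp

def harmonic_mean_estimate(n_samples=2000):
    # Harmonic mean estimator: average reciprocal likelihoods under (approximate)
    # posterior samples of theta. Here the "posterior" is a crude stand-in
    # (prior with a small shift); the estimator's instability is the paper's point.
    thetas = rng.dirichlet(alpha + 1.0, size=n_samples)
    log_w = np.array([log_p_doc_given_theta(t) for t in thetas])
    m = (-log_w).max()
    return -(m + np.log(np.exp(-log_w - m).mean()))       # -log-mean-exp of 1/p

print("importance sampling:", importance_sampling_estimate())
print("harmonic mean:      ", harmonic_mean_estimate())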
Probabilistic topic models are a popular tool for the unsupervised analysis of text, providing both ...
Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters...
The proliferation of large electronic document archives requires new techniques for automatically an...
Topic models are a discrete analogue to principal component analysis and independent component analy...
We consider the problem of evaluating the predictive log likelihood of a previously unseen document...
Statistical topic models such as latent Dirichlet allocation have become enormously popular in the...
Recently, there has been considerable progress on designing algorithms with provable guarantees - ty...
Topic models are unsupervised techniques that extract likely topics from text corpora, by creating p...
Topic models can learn topics that are highly interpretable, semantically-coherent and can be used s...
This article describes posterior maximization for topic models, identifying computational and concep...
Topic models are widely used unsupervised models capable of learning topics – weighted lists of word...
Topic models provide a powerful tool for analyzing large text collections by representing high dimen...
With the development of computer technology and the internet, increasingly large amounts of textual ...