Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, for instance the grouping of words into coherent text spans such as sentences, carries much information that these models generally lose. In this paper, we propose sentenceLDA, an extension of LDA that aims to overcome this limitation by incorporating the structure of the text into the generative and inference processes. We illustrate the advantages of sentenceLDA by comparing it with LDA on both intrinsic (perplexity) and extrinsic (text classification) evaluation tasks across different text collections.