Topic modeling is a widely used approach to analyzing large text collections. A small number of multilingual topic models have recently been explored to discover latent topics among parallel or comparable documents, such as in Wikipedia. Other topic models that were originally proposed for structured data are also applicable to multilingual documents. Correspondence Latent Dirichlet Allocation (CorrLDA) is one such model; however, it requires a pivot language to be specified in advance. We propose a new topic model, Symmetric Correspondence LDA (SymCorrLDA), that incorporates a hidden variable to control a pivot language, in an extension of CorrLDA. We experimented with two multilingual comparable datasets extracted from Wikipedia and dem...
In this paper, we study different applications of cross-language latent topic models trained on comp...
Topic Models like Latent Dirichlet Allocation have been widely used for their robustness in estimati...
Code-switched documents are common in social media, providing evidence for polylingual topic models ...
Topic models aim to analyze collections of documents and have been widely used in the fields of machin...
We study the problem of extracting cross-lingual topics from non-parallel multilingual text datasets...
We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side informa...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
Topic models like latent Dirichlet allocation (LDA) provide a framework for analyzing large datasets...
Topic models are a useful tool for analyzing large text collections, but have previously been applie...
Abstract. In this paper, we present the Polylingual Labeled Topic Model, a model which combines the ...
A topic model outputs a set of multinomial distributions over words for each topic. In this paper, w...
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requ...
Abstract. This paper explores bridging the content of two different languages via latent topics. Spe...