Disk-based algorithms can process large-scale data that does not fit into memory, so they scale well on mobile devices with limited memory resources. Because disk I/O is generally much slower than memory access, the total amount of disk I/O is the most crucial factor determining the efficiency of disk-based algorithms. This paper proposes BlockLDA, an efficient disk-based Latent Dirichlet Allocation (LDA) inference algorithm that can infer an LDA model when neither the data nor the model fits into memory. BlockLDA manages the data and model as a set of small blocks so that it can support efficient disk I/O and perform LDA inference in a block-wise manner. In...
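The block-wise idea described in the abstract can be illustrated with a minimal sketch. This is not the BlockLDA algorithm itself: the abstract does not give its block layout or sampling procedure, so the file format, the function names (`write_blocks`, `count_topic_words`, `demo`), and the block size below are all hypothetical. The sketch only shows the general pattern of storing token assignments as small block files and rebuilding topic-word counts while holding one block in memory at a time. For simplicity it assumes the count matrix fits in memory, which BlockLDA does not require.

```python
import os
import tempfile

import numpy as np


def write_blocks(tokens, block_size, directory):
    """Split (doc_id, word_id, topic_id) rows into fixed-size block files.

    The on-disk layout is a hypothetical choice for this sketch.
    """
    paths = []
    for start in range(0, len(tokens), block_size):
        path = os.path.join(directory, f"block_{start // block_size}.npy")
        np.save(path, tokens[start:start + block_size])
        paths.append(path)
    return paths


def count_topic_words(paths, num_topics, vocab_size):
    """Accumulate topic-word counts, loading one block from disk at a time."""
    counts = np.zeros((num_topics, vocab_size), dtype=np.int64)
    blocks_read = 0
    for path in paths:
        block = np.load(path)  # one disk read per block
        # Unbuffered add so repeated (topic, word) pairs are all counted.
        np.add.at(counts, (block[:, 2], block[:, 1]), 1)
        blocks_read += 1
    return counts, blocks_read


def demo():
    """Build synthetic token assignments and count them block by block."""
    rng = np.random.default_rng(0)
    num_tokens, num_docs, vocab_size, num_topics = 1000, 20, 50, 4
    tokens = np.column_stack([
        rng.integers(0, num_docs, num_tokens),
        rng.integers(0, vocab_size, num_tokens),
        rng.integers(0, num_topics, num_tokens),
    ])
    with tempfile.TemporaryDirectory() as directory:
        paths = write_blocks(tokens, 128, directory)
        counts, blocks_read = count_topic_words(paths, num_topics, vocab_size)
    return tokens, counts, blocks_read
```

In this sketch the number of disk reads per pass equals the number of blocks, so the block size directly controls the trade-off between peak memory use and I/O operations.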
Topic models such as Latent Dirichlet Allocation (LDA) have been widely used in information retrieva...
Topic modeling is a generalization of clustering that posits that observations (words in a document)...
In this paper, we propose an acceleration of collapsed variational Bayesian (CVB) inference for late...
We present the design and implementation of GLDA, a library that utilizes the GPU (Graphics Processi...
Latent Dirichlet allocation (LDA) is a widely-used probabilistic topic modeling tool for content ana...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
Latent Dirichlet Allocation (LDA) is a probability model for grouping hidden topics in documents by ...
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requ...
We describe distributed algorithms for two widely-used topic models, namely the Latent Dirichlet All...
Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections o...
Background: Unstructured and textual data is increasing rapidly and Latent Dirichlet Alloca...
Statistical topic models such as the Latent Dirichlet Allocation (LDA) have emerged as an attractive...
Topic modeling is a generalization of clustering that posits that observations (words in a document)...
There has been an explosion in the amount of digital text information available in recent years, lea...
Latent Dirichlet Allocation (LDA) is a popular machine-learning technique that identifies latent str...