We present the design and implementation of GLDA, a library that utilizes the GPU (Graphics Processing Unit) to perform Gibbs sampling of Latent Dirichlet Allocation (LDA) on a single machine. LDA is an effective topic model used in many applications, e.g., classification, feature selection, and information retrieval. However, training an LDA model on large data sets takes hours, even days, due to the heavy computation and intensive memory access. Therefore, we explore the use of the GPU to accelerate LDA training on a single machine. Specifically, we propose three memory-efficient techniques to handle large data sets on the GPU: (1) generating document-topic counts as needed instead of storing all of them, (2) adopting a compact storage sc...
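To make technique (1) concrete, the following is a minimal CPU sketch in Python/NumPy of collapsed Gibbs sampling for LDA in which the per-document topic counts are rebuilt on the fly from the current topic assignments each time a document is visited, rather than kept as a dense (num_docs x num_topics) matrix. This is an illustrative sketch of the idea only, not GLDA's GPU implementation or API; all names (sample_pass, docs, n_wk, etc.) are assumed for the example.

import numpy as np

def sample_pass(docs, z, n_wk, n_k, alpha, beta, rng):
    """One Gibbs sweep. docs[d] is a list of word ids, z[d] the matching topic assignments."""
    V, K = n_wk.shape
    for d, words in enumerate(docs):
        # Technique (1): generate this document's topic counts as needed
        # from its current assignments, instead of storing all of them.
        n_dk = np.bincount(z[d], minlength=K).astype(np.float64)
        for i, w in enumerate(words):
            k_old = z[d][i]
            # Exclude the current token from all counts.
            n_dk[k_old] -= 1
            n_wk[w, k_old] -= 1
            n_k[k_old] -= 1
            # Standard collapsed Gibbs conditional for LDA.
            p = (n_dk + alpha) * (n_wk[w] + beta) / (n_k + V * beta)
            k_new = rng.choice(K, p=p / p.sum())
            # Restore counts with the new assignment.
            z[d][i] = k_new
            n_dk[k_new] += 1
            n_wk[w, k_new] += 1
            n_k[k_new] += 1
        # n_dk is discarded here; only z, n_wk and n_k persist across documents.

# Toy usage with random initialization (vocabulary size 4, 3 topics).
rng = np.random.default_rng(0)
docs = [[0, 1, 2, 1], [2, 3, 3, 0]]
V, K, alpha, beta = 4, 3, 0.1, 0.01
z = [rng.integers(K, size=len(doc)) for doc in docs]
n_wk = np.zeros((V, K))
n_k = np.zeros(K)
for d, doc in enumerate(docs):
    for w, k in zip(doc, z[d]):
        n_wk[w, k] += 1
        n_k[k] += 1
for _ in range(10):
    sample_pass(docs, z, n_wk, n_k, alpha, beta, rng)

The design point the sketch illustrates is the memory trade-off: the word-topic table (V x K) and topic totals (K) stay resident, while the document-topic vector (K) exists only while its document is being sampled, which is what makes very large corpora fit in limited GPU memory.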