Abstract. We present the design and implementation of GLDA, a library that utilizes the GPU (Graphics Processing Unit) to perform Gibbs sampling of Latent Dirichlet Allocation (LDA) on a single machine. LDA is an effective topic model used in many applications, e.g., classification, feature selection, and information retrieval. However, training an LDA model on large data sets takes hours, even days, due to the heavy computation and intensive memory access. Therefore, we explore the use of the GPU to accelerate LDA training on a single machine. Specifically, we propose three memory-efficient techniques to handle large data sets on the GPU: (1) generating document-topic counts as needed instead of storing all of them, (2) adopting a compa...
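The abstract describes technique (1) only at a high level, so the following is a minimal sketch, assuming a standard collapsed Gibbs sampler: it rebuilds one document's topic-count vector from that document's token assignments whenever the document is visited, so no dense D x K document-topic matrix is ever stored. It is written as plain CPU-side Python/NumPy rather than a GPU kernel, and every name and the toy corpus below are hypothetical illustrations, not GLDA's actual API.

```python
import numpy as np

def rebuild_doc_topic_counts(z_d, num_topics):
    # n_{d,k}: how many tokens of document d are currently assigned to topic k,
    # regenerated on demand from the assignment list z_d (technique (1)).
    return np.bincount(z_d, minlength=num_topics)

def gibbs_pass(docs, z, n_kw, n_k, alpha, beta, rng):
    # One collapsed-Gibbs sweep; only an O(K) count vector for the current
    # document is live at any time, never a D x K matrix.
    num_topics, vocab_size = n_kw.shape
    for d, words in enumerate(docs):
        n_dk = rebuild_doc_topic_counts(z[d], num_topics)
        for i, w in enumerate(words):
            k_old = z[d][i]
            # remove the current assignment from all counts
            n_dk[k_old] -= 1; n_kw[k_old, w] -= 1; n_k[k_old] -= 1
            # collapsed-Gibbs conditional:
            #   p(k) ∝ (n_dk[k] + alpha) * (n_kw[k, w] + beta) / (n_k[k] + V * beta)
            p = (n_dk + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
            k_new = rng.choice(num_topics, p=p / p.sum())
            # write the new assignment back into all counts
            z[d][i] = k_new
            n_dk[k_new] += 1; n_kw[k_new, w] += 1; n_k[k_new] += 1
        # n_dk is discarded here; nothing document-sized persists across docs.

# Toy usage with made-up data:
rng = np.random.default_rng(0)
docs = [[0, 1, 1, 2], [2, 3, 0]]      # word ids per document
K, V = 4, 5
z = [list(rng.integers(0, K, size=len(doc))) for doc in docs]
n_kw = np.zeros((K, V), dtype=np.int64)
n_k = np.zeros(K, dtype=np.int64)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        n_kw[z[d][i], w] += 1
        n_k[z[d][i]] += 1
gibbs_pass(docs, z, n_kw, n_k, alpha=0.1, beta=0.01, rng=rng)
```

The trade-off is recomputation for memory: rebuilding n_dk costs O(len(doc)) per visit, which is cheap next to the O(len(doc) * K) sampling work, while the D x K array it replaces is what dominates memory on large corpora.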
Statistical topic models such as the Latent Dirichlet Allocation (LDA) have emerged as an attractive...
Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled ...
The effectiveness of Machine Learning (ML) methods depends on access to large, suitable datasets. In t...
Disk-based algorithms can process large-scale data that do not fit into memory,...
The recent emergence of Graphics Processing Units (GPUs) as general-purpose parallel computing devic...
Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using S...
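This entry breaks off mid-sentence, but the core LSA step it refers to is presumably a truncated SVD (Singular Value Decomposition) of the term-document matrix. A minimal sketch under that assumption, with a made-up toy matrix and rank:

```python
import numpy as np

# Toy term-document matrix (3 terms x 4 documents); the values are invented.
A = np.array([[2., 0., 1., 0.],
              [1., 3., 0., 1.],
              [0., 1., 2., 2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

r = 2                                    # keep only the top-r singular values
A_r = (U[:, :r] * s[:r]) @ Vt[:r, :]     # best rank-r approximation of A
docs_r = (s[:r][:, None] * Vt[:r, :]).T  # each row: a document in r-dim space
```

Keeping only the top r singular values projects terms and documents into a shared r-dimensional latent space, which is what makes the reduced representation useful for similarity queries.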
The rise of deep learning (DL) has been fuelled by improvements in accelerators. Due to its uniq...
When building large-scale machine learning (ML) programs, such as big topic models or deep neural ne...
The recent dramatic progress in machine learning is partially attributed to the availability of high...
We describe distributed algorithms for two widely-used topic models, namely the Latent Dirichlet All...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
Abstract. One of the major current research trends is the evolution of heterogeneous parallel comp...
Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent...
We present LDA*, a system that has been deployed in one of the largest Internet companies to fulfil ...