We present the design and implementation of GLDA, a library that utilizes the GPU (Graphics Processing Unit) to perform Gibbs sampling of Latent Dirichlet Allocation (LDA) on a single machine. LDA is an effective topic model used in many applications, e.g., classification, feature selection, and information retrieval. However, training an LDA model on large data sets takes hours, even days, due to the heavy computation and intensive memory access. Therefore, we explore the use of the GPU to accelerate LDA training on a single machine. Specifically, we propose three memory-efficient techniques to handle large data sets on the GPU: (1) generating document-topic counts as needed instead of storing all of them, (2) adopting a compact storage sc...
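To make technique (1) concrete, the following is a minimal CPU sketch in Python/NumPy of collapsed Gibbs sampling for LDA in which the per-document topic counts are rebuilt on the fly from the current topic assignments each time a document is visited, rather than kept as a dense (num_docs x num_topics) matrix. This is an illustrative sketch of the idea only, not GLDA's GPU implementation or API; all names (sample_pass, docs, n_wk, etc.) are assumed for the example.

import numpy as np

def sample_pass(docs, z, n_wk, n_k, alpha, beta, rng):
    """One Gibbs sweep. docs[d] is a list of word ids, z[d] the matching topic assignments."""
    V, K = n_wk.shape
    for d, words in enumerate(docs):
        # Technique (1): generate this document's topic counts as needed
        # from its current assignments, instead of storing all of them.
        n_dk = np.bincount(z[d], minlength=K).astype(np.float64)
        for i, w in enumerate(words):
            k_old = z[d][i]
            # Exclude the current token from all counts.
            n_dk[k_old] -= 1
            n_wk[w, k_old] -= 1
            n_k[k_old] -= 1
            # Standard collapsed Gibbs conditional for LDA.
            p = (n_dk + alpha) * (n_wk[w] + beta) / (n_k + V * beta)
            k_new = rng.choice(K, p=p / p.sum())
            # Restore counts with the new assignment.
            z[d][i] = k_new
            n_dk[k_new] += 1
            n_wk[w, k_new] += 1
            n_k[k_new] += 1
        # n_dk is discarded here; only z, n_wk and n_k persist across documents.

# Toy usage with random initialization (vocabulary size 4, 3 topics).
rng = np.random.default_rng(0)
docs = [[0, 1, 2, 1], [2, 3, 3, 0]]
V, K, alpha, beta = 4, 3, 0.1, 0.01
z = [rng.integers(K, size=len(doc)) for doc in docs]
n_wk = np.zeros((V, K))
n_k = np.zeros(K)
for d, doc in enumerate(docs):
    for w, k in zip(doc, z[d]):
        n_wk[w, k] += 1
        n_k[k] += 1
for _ in range(10):
    sample_pass(docs, z, n_wk, n_k, alpha, beta, rng)

The design point the sketch illustrates is the memory trade-off: the word-topic table (V x K) and topic totals (K) stay resident, while the document-topic vector (K) exists only while its document is being sampled, which is what makes very large corpora fit in limited GPU memory.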