Topic models such as Latent Dirichlet Allocation (LDA) have been widely used in information retrieval for tasks ranging from smoothing and feedback methods to tools for exploratory search and discovery. However, classical methods for inferring topic models do not scale up to the massive size of today's publicly available Web-scale data sets. The state-of-the-art approaches rely on custom strategies, implementations and hardware to facilitate their asynchronous, communication-intensive workloads. We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server. Advantages of this integration include convenient usage of existing data processing pip...
When building large-scale machine learning (ML) programs, such as big topic models or deep neural ne...
Latent topic analysis has emerged as one of the most effective methods for classifying, clustering a...
Today we are living in the modern Internet era. We can get all our information from the Internet anytime...
In this paper, I apply Latent Dirichlet Allocation (LDA) to cluster 100,000 health-related articles u...
Thesis (Master's)--University of Washington, 2014. In their 2001 work Latent Dirichlet Allocation, Ble...
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requ...
We present LDA*, a system that has been deployed in one of the largest Internet companies to fulfil ...
Topic modeling algorithms (like Latent Dirichlet Allocation) tend to be very slow when run over larg...
Learning meaningful topic models with massive document collections which contain millions of documen...
Search algorithms incorporating some form of topic model have a long history in information retrieva...
Topics discovered by the latent Dirichlet allocation (LDA) method are sometimes not meaningful for h...
Background: Unstructured and textual data is increasing rapidly and Latent Dirichlet Alloca...
Web APIs are a popular way to organize network services in cloud computing environments. Howev...
We describe the methodology that we followed to automatically extract topics corresponding to known ...
We describe distributed algorithms for two widely-used topic models, namely the Latent Dirichlet All...