In this paper we present an approach for scaling up Bayesian learning using variational methods by exploiting distributed computing clusters managed by modern big data processing tools like Apache Spark or Apache Flink, which e ciently support iterative map-reduce operations. Our approach is de ned as a distributed projected natural gradient ascent algorithm, has excellent convergence properties, and covers a wide range of conjugate exponential family models. We evaluate the proposed algorithm on three real-world datasets from di erent domains (the Pubmed abstracts dataset, a GPS trajectory dataset, and a nancial dataset) and using several models (LDA, factor analysis, mixture of Gaussians and linear regression models). Our approach compare...
There has been considerable recent interest in Bayesian modeling of high-dimensional networks via la...
This work presents a decentralized, approx-imate method for performing variational in-ference on a n...
Variational inference is one of the tools that now lies at the heart of the modern data analysis lif...
This paper makes two contributions to Bayesian machine learning algorithms. Firstly, we propose stoc...
The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coh...
The availability of massive computational resources has led to a wide-spread application and develop...
Abstract—Explosive growth in data and availability of cheap computing resources have sparked increas...
Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have b...
Multilevel clustering problems where the con-tent and contextual information are jointly clustered a...
While modern machine learning and deep learning seem to dominate the areas where scalability and mod...
This paper presents a methodology for creating streaming, distributed inference algorithms for Bayes...
A current challenge for data management systems is to support the construction and maintenance of ma...
We present a new parallel algorithm for learning Bayesian inference networks from data. Our learning...
This thesis aims to scale Bayesian machine learning (ML) to very large datasets. First, I propose a ...
Appropriate tools for managing large-scale data, like online texts, images and user pro-files, are b...
There has been considerable recent interest in Bayesian modeling of high-dimensional networks via la...
This work presents a decentralized, approx-imate method for performing variational in-ference on a n...
Variational inference is one of the tools that now lies at the heart of the modern data analysis lif...
This paper makes two contributions to Bayesian machine learning algorithms. Firstly, we propose stoc...
The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coh...
The availability of massive computational resources has led to a wide-spread application and develop...
Abstract—Explosive growth in data and availability of cheap computing resources have sparked increas...
Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have b...
Multilevel clustering problems where the con-tent and contextual information are jointly clustered a...
While modern machine learning and deep learning seem to dominate the areas where scalability and mod...
This paper presents a methodology for creating streaming, distributed inference algorithms for Bayes...
A current challenge for data management systems is to support the construction and maintenance of ma...
We present a new parallel algorithm for learning Bayesian inference networks from data. Our learning...
This thesis aims to scale Bayesian machine learning (ML) to very large datasets. First, I propose a ...
Appropriate tools for managing large-scale data, like online texts, images and user pro-files, are b...
There has been considerable recent interest in Bayesian modeling of high-dimensional networks via la...
This work presents a decentralized, approx-imate method for performing variational in-ference on a n...
Variational inference is one of the tools that now lies at the heart of the modern data analysis lif...