This thesis aims to scale Bayesian machine learning (ML) to very large datasets. First, I propose a pairwise Gaussian random field (PGRF) model for high-dimensional data imputation. The PGRF is a graphical, factor-based model. In addition to its high accuracy, the PGRF is more efficient and scalable than the Gaussian Markov random field (GMRF) model. Experiments show that PGRF imputation followed by linear regression (LR) or a support vector machine (SVM) reduces the root-mean-square error (RMSE) by 10% to 45% compared with mean imputation followed by LR or an SVM. Furthermore, the PGRF scales imputation to very large datasets distributed across a 100-machine cluster, which the GMRF and Gaussian methods could not handle at all. Unfortunately, the PGRF model is hard to...
Abstract—The motivation for this paper is to apply Bayesian structure learning using Model Averaging...
University of Minnesota Ph.D. dissertation. April 2020. Major: Computer Science. Advisor: Arindam Ba...
Each of the three chapters included here attempts to meet a different computing challenge that prese...
Anyone working in machine learning requires a particular balance between multiple disciplines. A sol...
Bayesian statistics has emerged as a leading paradigm for the analysis of complicated datasets and f...
Abstract—Explosive growth in data and availability of cheap computing resources have sparked increas...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
In the last decade or so, there has been a dramatic increase in storage facilities and the possibili...
Bayesian networks (BNs) are highly practical and successful tools for modeling probabilistic knowled...
Each of the three chapters included here attempts to meet a different computing challenge that pres...
This thesis is focused on the development of computationally efficient procedures for regression mod...
Many modern applications fall into the category of "large-scale" statistical problems, in which b...
Modern big data analytics often involve large data sets in which the features of interest are measur...
In the big data era, scalability has become a crucial requirement for any useful computational model...
Computational intensity and sequential nature of estimation techniques for Bayesian methods in stati...