Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, February, 2020
Cataloged from the official PDF version of thesis.
Includes bibliographical references (pages 89-93).
Data cleaning is naturally framed as probabilistic inference in a generative model, combining a prior distribution over ground-truth databases with a likelihood that models the noisy channel by which the data are filtered, corrupted, and joined to yield incomplete, dirty, and denormalized datasets. Based on this view, this thesis presents PClean, a unified generative modeling architecture for cleaning and normalizing dirty data in diverse domains. Given an unclean dataset and a probabilistic program encoding...
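The framing in the abstract (a prior over clean values combined with a noisy-channel likelihood) can be illustrated with a toy, single-field example. This is only a sketch under stated assumptions and is not PClean's modeling language or inference algorithm: the city names, the prior weights, the `typo_rate` parameter, and the helper names are all hypothetical, and the likelihood is a crude unnormalized stand-in that scores an observation as `typo_rate ** edit_distance`.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def posterior_over_clean_values(observed, prior, typo_rate=0.05):
    """Bayes rule over a finite set of candidate clean values.

    Assumption: the noisy channel is approximated by an unnormalized score
    typo_rate ** edit_distance(observed, clean); a real model would use a
    properly normalized error distribution.
    """
    scores = {
        clean: p * typo_rate ** edit_distance(observed, clean)
        for clean, p in prior.items()
    }
    total = sum(scores.values())
    return {clean: s / total for clean, s in scores.items()}


# Hypothetical prior over ground-truth values of a "city" field.
prior = {"Boston": 0.6, "Austin": 0.3, "Houston": 0.1}

# Dirty observation with one dropped character.
posterior = posterior_over_clean_values("Bostn", prior)
best = max(posterior, key=posterior.get)
```

Here the most probable clean value for the dirty entry `"Bostn"` is the candidate that is both common under the prior and close in edit distance. Cleaning a whole database in this style additionally requires a prior over relational structure and an inference method that scales, which is the subject of the thesis.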
Data Cleaning, despite being a long standing problem, has occupied the center stage again thanks to ...
We consider the problem of Bayesian inference in the family of probabilistic models implicitly defin...
Probabilistic models used in quantitative sciences have historically co-evolved with methods for per...
Data cleaning is naturally framed as probabilistic inference in a generative model of ground-truth d...
Data Cleaning is a long-standing problem, which is growing in importance with the mass of uncurated...
There is a considerable body of work on data cleaning which employs various principles to rectify er...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
Although probabilistic modeling and Bayesian inference provide a unifying theoretical framework for ...
Probabilistic modeling lets us infer, predict and make decisions based on incomplete or noisy data. ...
Probabilistic modeling and reasoning are central tasks in artificial intelligence and machine learni...
This thesis aims to scale Bayesian machine learning (ML) to very large datasets. First, I propose a ...
Probabilistic reasoning, among methodologies used within the domain of artificial intelligence, is r...
Imagine a world where computational simulations can be inverted as easily as running them forwards, ...
Digitally collected data suffers from many data quality issues, such as duplicate, incorrect,...