As machine learning gains significant attention in many disciplines and research communities, the variety of data structures has increased, with examples including distributions and sets of observations. In this thesis, we consider sets and distributions as inputs for machine learning problems. In particular, we propose non-parametric tests, supervised learning, semi-supervised learning and metalearning methodologies on these objects. In each case, with careful consideration of the input structure, we construct models that are applicable to various real-life tasks. We begin by considering the problem of weakly supervised learning on aggregate outputs, where the labels are only available at a much coarser resolution than the level o...
This thesis explores one of the most fundamental questions in Machine Learning, namely, how should t...
Most machine learning algorithms, such as classification or regression, treat the individual data po...
This thesis develops statistical machine learning methodology for three distinct tasks. Each method ...
Machine learning has made incredible advances in the last couple of decades. Notwithstanding, a lot o...
Abstract. We propose a novel approach for the estimation of the size of training sets that are neede...
In this dissertation, we explore two fundamental sets of inference problems arising in machine learn...
Robustness and generalizability of supervised learning algorithms depend on the quality of the label...
Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kern...
In machine learning, we traditionally evaluate the performance of a single model, averaged over a co...
This thesis concerns the development and mathematical analysis of statistical procedures for classi...
A common use case of machine learning in real world settings is to learn a model from historical dat...
Many simulation data sets are so massive that they must be distributed among disk farms attached to ...
Representing high-order interactions in data often results in large models with an intractable numbe...
The focus of this dissertation is on extending targeted learning to settings with complex unknown de...