This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low-dimensional and sparse high-dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible. For k within this range, the resulting estimators have the same inferential efficiency and estimation rates as an oracle with access to the full sample. Thorough numerical results are provided to back up the theory.
In this dissertation, we make progress on certain algorithmic problems broadly over two computationa...
We propose two new procedures based on multiple hypothesis testing for correct...
We study the distributed machine learning problem where the n feature-response pairs are partitioned...
This dissertation consists of three research papers that deal with three different problems in stati...
We derive minimax testing errors in a distributed framework where the data is split over multiple ma...
This thesis considers estimation and statistical inference for high dimensional model with constrain...
This book generalizes and extends the available theory in robust and decentralized hypothesis testin...
This thesis presents three projects, including adaptive estimation in high-dimensional additive mode...
Fitting statistical models is computationally challenging when the sample size or the dimension of t...
For high dimensional statistical models, researchers have begun to focus on situations which can be...
This paper is about two related decision theoretic problems, nonparametric two-sample testing and in...
We propose two new procedures based on multiple hypothesis testing for correct support estimation in...
High-dimensional statistical inference deals with models in which the number of parameters $p$ is co...
Motivated by diverse applications in ecology, genetics, and language modeling, researchers in learni...