We consider a partially linear framework for modeling massive heterogeneous data. The major goal is to extract common features across all subpopulations while exploring heterogeneity of each subpopulation. In particular, we propose an aggregation type estimator for the commonality parameter that possesses the (nonasymptotic) minimax optimal bound and asymptotic distribution as if there were no heterogeneity. This oracle result holds when the number of subpopulations does not grow too fast. A plug-in estimator for the heterogeneity parameter is further constructed, and shown to possess the asymptotic distribution as if the commonality information were available. We also test the heterogeneity among a large number of subpopulations. All the a...
This dissertation consists of three chapters with a central theme on unobserved heterogeneity in econom...
This paper studies the estimation of a panel data model with latent structures where individuals can...
Generalized linear mixed models are frequently applied to data with clustered categorical outcomes. ...
In this paper, we study the large-scale inference for a linear expectile regression model. To mitiga...
If there are extraordinarily large data, too large to fit into a single computer or too expensive to...
This paper describes a class of heteroscedastic generalized linear regression models in which a subs...
We propose a modeling strategy for structured populations, in which individuals are not nece...
We live in an age of big data. Analyzing modern data sets can be very difficult because they usually...
Heterogeneity is often natural in many contemporary applications involving massive data. While posin...
We develop a frequentist method of simultaneous small area estimation under hierarchical models. The...
The computation of the maximum likelihood (ML) estimator for heteroscedastic regression models is co...
This paper provides methods for flexibly capturing unobservable heterogeneity from longitudinal data...
In classical model fitting techniques, such as traditional Multiple Linear Regression models (MLR) ...
This thesis develops statistical machine learning methodology for three distinct tasks. Each method ...
The linear regression model is widely used in empirical work in economics, statistics, and many othe...