The linear regression model remains an important workhorse for data scientists. However, many data sets contain many more predictors than observations. Besides, outliers, or anomalies, frequently occur. This paper proposes an algorithm for regression analysis that addresses these features typical for big data sets, which we call “sparse shooting S”. The resulting regression coefficients are sparse, meaning that many of them are set to zero, hereby selecting the most relevant predictors. A distinct feature of the method is its robustness with respect to outliers in the cells of the data matrix. The excellent performance of this robust variable selection and prediction method is shown in a simulation study. A real data application on car fuel...
Due to the increasing availability of data sets with a large number of variables, sparse model estim...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...
The linear regression model remains an important workhorse for data scientists. However, many data s...
The linear regression model remains an important workhorse for data scientists. However, many data s...
The linear regression model remains an important workhorse for data scientists. However, many data s...
A challenging problem in a linear regression model is to select a parsimonious model which is robust...
In multiple regression analysis, a response variable is predicted based on a set of many predictor v...
Outliers in the data are a common problem in applied statistics. Estimators that give reliable resul...
Sparse model estimation is a topic of high importance in modern data analysis due to the increasing ...
Sparse model estimation is a topic of high importance in modern data analysis due to the increasing ...
We propose a procedure for computing a fast approximation to regression estimates based on the minim...
High-dimensional data analysis has become an indispensable part of modern statistics. Due to technol...
Standard statistical techniques such as least squares regression are very accurate if the underlying...
Linear regression models are commonly used statistical models for predicting a response from a set o...
Due to the increasing availability of data sets with a large number of variables, sparse model estim...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...
The linear regression model remains an important workhorse for data scientists. However, many data s...
The linear regression model remains an important workhorse for data scientists. However, many data s...
The linear regression model remains an important workhorse for data scientists. However, many data s...
A challenging problem in a linear regression model is to select a parsimonious model which is robust...
In multiple regression analysis, a response variable is predicted based on a set of many predictor v...
Outliers in the data are a common problem in applied statistics. Estimators that give reliable resul...
Sparse model estimation is a topic of high importance in modern data analysis due to the increasing ...
Sparse model estimation is a topic of high importance in modern data analysis due to the increasing ...
We propose a procedure for computing a fast approximation to regression estimates based on the minim...
High-dimensional data analysis has become an indispensable part of modern statistics. Due to technol...
Standard statistical techniques such as least squares regression are very accurate if the underlying...
Linear regression models are commonly used statistical models for predicting a response from a set o...
Due to the increasing availability of data sets with a large number of variables, sparse model estim...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...
We consider the problem of selecting a parsimonious subset of explanatory variables from a potential...