High-dimensional variable selection is a challenging task, especially when groups of highly correlated variables are present in the data, as in genomics research, direction-of-arrival estimation, and financial engineering. Recently, the T-Knock filter, a new framework for fast variable selection in high-dimensional settings, has been developed. It provably controls the false discovery rate (FDR) at a given target level. However, its current version does not account for groups of highly correlated variables, which can lead to a loss in the true positive rate (TPR), i.e., the power. Hence, we propose the T-Knock+GVS filter, which allows for grouped variable selection with FDR control in such settings. This is achieved by modifying the forward ...
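To make the two quantities in the abstract concrete, the following is a minimal sketch of how the false discovery proportion (FDP) and true positive proportion (TPP) of a single selected set are commonly computed; the FDR and TPR are the expectations of these proportions over repeated experiments. The function name `fdp_and_tpp`, its arguments, and the zero-denominator conventions are assumptions made for illustration and are not taken from the T-Knock papers.

```python
def fdp_and_tpp(selected, true_support):
    """Return (FDP, TPP) for one selected variable set.

    selected:     indices of variables chosen by some selection method
    true_support: indices of the truly active variables (known only in simulations)
    """
    selected = set(selected)
    true_support = set(true_support)

    true_positives = len(selected & true_support)
    false_positives = len(selected - true_support)

    # Assumed convention: a proportion with an empty denominator is set to 0.
    fdp = false_positives / max(len(selected), 1)
    tpp = true_positives / max(len(true_support), 1)
    return fdp, tpp


# Hypothetical example: 4 variables selected, 3 of them among the 5 true actives.
example_fdp, example_tpp = fdp_and_tpp(selected=[0, 1, 2, 7],
                                       true_support=[0, 1, 2, 3, 4])
```

Averaging `fdp` and `tpp` over many simulated data sets gives Monte Carlo estimates of the FDR and TPR, which is how a target FDR level would typically be checked empirically.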
The discovery of biomarkers that are informative for cancer risk assessment, diagnosis, prognosis an...
Background. The iterative sure independence screening (ISIS) is a popular method in selecting import...
Background. Modern biotechnologies often result in high-dimensional data sets with many more varia...
We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dime...
In many fields of science, we observe a response variable together with a large number of potential ...
A genome-wide association study (GWAS) aims to determine genetic variants statistically associated w...
In many applications, we need to study a linear regression model that consists of a response variabl...
Stability Selection, which combines penalized regression with subsampling, is a promising algorithm ...
Controlled variable selection is an important analytical step in various scientific fields, such as ...
The generalized linear model (GLM) has been widely used in practice to model counts or other types o...
Given the costliness of HIV drug therapy research, it is important not only to maximize true positiv...
In many scientific and medical settings, large-scale experiments are generating large quantities of ...
Large-scale genetic association studies are increasingly utilized for identifying novel susceptible ...
We consider the variable selection problem, which seeks to identify important variables influencin...
Abstract Background When many (up to millions) of statistical tests are conducted in discovery set a...