Data mining is characterised by its ability to process large amounts of data. Among those data are the "features": variables, or association rules that can be derived from them. Selecting the most interesting features is a classical data mining problem. That selection requires a large number of tests, which give rise to a number of false discoveries. This paper proposes an original nonparametric method for controlling them. A new criterion, UAFWER, defined as the risk of exceeding a pre-set number of false discoveries, is controlled by BS FD, a bootstrap-based algorithm that can be used on one- or two-sided problems. The usefulness of the procedure is illustrated by the selection of differentially interesting association rules on genetic data.
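To make the criterion concrete, here is a minimal sketch of one generic way to bound the risk of exceeding a pre-set number of false discoveries with a bootstrap. It is not the paper's BS FD algorithm, whose details are not given in this abstract. The sketch assumes a one-sided problem in which each feature's statistic is its column mean; the names `bootstrap_threshold`, `v0`, `delta` and `n_boot` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bootstrap_threshold(data, v0=0, delta=0.05, n_boot=500, seed=0):
    """Estimate a selection threshold from bootstrap resamples.

    Assumption (not from the paper): the statistic of each feature is its
    column mean, and a feature is selected when that mean exceeds the
    returned threshold. The threshold is the (1 - delta) quantile of the
    (v0 + 1)-th largest centred bootstrap statistic, so that more than v0
    centred statistics exceed it in roughly a fraction delta of resamples.
    """
    rng = np.random.default_rng(seed)
    n, m = data.shape
    if not 0 <= v0 < m:
        raise ValueError("v0 must satisfy 0 <= v0 < number of features")
    observed = data.mean(axis=0)
    kth_largest = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        # Centring on the observed means mimics the null hypothesis.
        centred = data[idx].mean(axis=0) - observed
        kth_largest[b] = np.sort(centred)[-(v0 + 1)]
    return float(np.quantile(kth_largest, 1 - delta))


# Synthetic illustration: 50 features, of which the first 5 have a real effect.
demo_rng = np.random.default_rng(1)
demo = demo_rng.normal(size=(200, 50))
demo[:, :5] += 1.0

strict = bootstrap_threshold(demo, v0=0)   # tolerate no false discovery
relaxed = bootstrap_threshold(demo, v0=2)  # tolerate up to two
selected = demo.mean(axis=0) > relaxed
```

Allowing a few false discoveries (`v0 > 0`) can only lower the threshold relative to `v0 = 0` on the same resamples, which is the trade-off the criterion is designed to expose. A two-sided variant could apply the same steps to absolute values of the centred statistics.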
High-dimensional hypothesis testing is ubiquitous in the biomedical sciences, and informative covari...
Case-control studies of genetic polymorphisms and gene-environment interactions are reporting large ...
How to weigh the Benjamini-Hochberg procedure? In the context of multiple hypo...
Background: Procedures for controlling the false discovery rate (FDR) are widely applied as ...
Background: When many (up to millions) of statistical tests are conducted in discovery set analyses ...
In this thesis, general theoretical tools are constructed which can be applied to develop machine ...
The false discovery proportion (FDP) is a convenient way to account for false ...
In light of the vast amounts of genomic data that are now being generated, we propose a new measure,...
In modern applications of high-throughput technologies, it is important to identify pairwise associa...
The search for interesting Boolean association rules is an important topic in knowledge discovery in...
Association rule mining is an important problem in the data mining area. It enumerates and tests a l...
Background: In high-throughput studies, hundreds to millions of hypotheses are typically tested. Sta...
This article extends false discovery rates to random fields, for which there are uncountably many hy...
Stability Selection, which combines penalized regression with subsampling, is a promising algorithm ...