42 pages. This project studies methods that use data subsampling to perform model selection. Most commonly used model selection methods require training every candidate model on the entire training set several times in order to pick the best one. This repeated training is often one of the most computationally expensive aspects of model selection. It would therefore be valuable to understand how resources can be better allocated to pick the best model for a given dataset. This project explores how to optimize resource allocation for model selection by subsampling data. We try three different approaches to model selection: (1) a randomized multi-armed bandit approach, (2) subsampling using influence functions, and finally (3) a new boostin...
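As a rough illustration of allocating training budget across candidate models through subsampling, the sketch below uses successive halving, a standard bandit-style elimination scheme. It is not the project's own randomized multi-armed bandit algorithm, which the abstract does not spell out. The toy data, the three candidate models, and every name (`make_data`, `successive_halving`, `CANDIDATES`, `start_size`) are assumptions made for illustration only.

```python
import random


def make_data(n, rng):
    """Toy 1-D regression data (assumed for illustration): y = 2x + noise."""
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        points.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return points


def fit_mean(train):
    """Baseline model: always predict the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_linear(train):
    """Ordinary least squares on a single feature."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)


def fit_1nn(train):
    """1-nearest-neighbour regressor."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def mse(predict, data):
    """Mean squared error of a fitted predictor on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def successive_halving(candidates, train, valid, start_size=20, rng=None):
    """Train all surviving models on a small subsample, keep the better
    half by validation error, double the subsample size, and repeat."""
    rng = rng or random.Random(0)
    alive = dict(candidates)
    size = start_size
    while len(alive) > 1:
        subsample = rng.sample(train, min(size, len(train)))
        scores = {name: mse(fit(subsample), valid) for name, fit in alive.items()}
        ranked = sorted(scores, key=scores.get)  # lowest error first
        keep = max(1, len(ranked) // 2)
        alive = {name: alive[name] for name in ranked[:keep]}
        size *= 2
    return next(iter(alive))


CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "1nn": fit_1nn}

data_rng = random.Random(0)
train_set = make_data(1000, data_rng)
valid_set = make_data(200, data_rng)
best = successive_halving(CANDIDATES, train_set, valid_set)
print("selected model:", best)
```

The point of the design is that weak candidates are discarded after seeing only a small subsample, so most of the training budget goes to the models that are still in contention rather than to full-data training of every candidate.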
In the last few years, as processing the data became a part of everyday life in different areas of h...
The success of machine learning (ML) systems depends on data availability, vol...
Building a deep learning model based on a small dataset is difficult, even impossible. To avoid over...
Automated machine learning (AutoML) frameworks have become important tools in the data scientists' a...
ArXiv Subjects: Statistics Theory (math.ST). Hyperparameter tuning and model sel...
Fine-tuning from a collection of models pre-trained on different domains (a “model zoo”) is emerging...
186 pages. Automated machine learning (AutoML) seeks to reduce the human and machine costs of finding ...
Liuliakov A, Hermes L, Hammer B. AutoML technologies for the identification of sparse classification...
We present a new variable selection method based on model-based gradient boosting and randomly permu...
The aim of the paper is to develop hypothesis testing procedures both for variable selection and mod...
Most active learning methods avoid model selection by training models of one type (SVMs, boosted tre...
The bootstrap is a widely used procedure for statistical inference because of its simplicity and att...
In the time of Big Data, training complex models on large-scale data sets is challenging, making it ...
The great success of deep learning heavily relies on increasingly larger training data, which comes ...