This thesis investigates a new method for estimating a system norm using reinforcement learning. Given an unknown system, we aim to estimate its H∞-norm with a model-free approach, which involves solving a sequential input design problem. This problem is modeled as a multi-armed bandit, which provides a way to study optimal decision making under uncertainty. In the multi-armed bandit framework, there are two different types of policies: index policies and Bayesian policies. The main goal of this thesis is to compare the performance of these two classes of policies. We take Thompson Sampling as the representative of Bayesian policies and five different UCB-type algorithms from the class of index policies. We compare these algorithms in two different setups depend...
We fill in a long open gap in the characterization of the minimax rate for the...
Publication in a scientific journal. Summary: Several researchers have recently investigated th...
In this thesis an attempt is made to find the optimal order execution policy that maximizes the rewa...
We study the problem of estimating the largest gain of an unknown linear and time-invariant filter, ...
Engineering sciences deal with the problem of optimal design in the face of uncertainty. In particul...
Master's thesis in information and communication technology, 2009, Universitetet i Agder, Grimstad. The t...
Multi-armed bandits (MABs) have been studied extensively in the literature and have applications in ...
Master's thesis in information and communication technology, 2010, Universitetet i Agder, Grimstad. Multi...
We present a formal model of human decision-making in explore-exploit tasks using the conte...
The multi-armed bandit (MAB) problem is a mathematical formulation of the exploration-exploitation t...
We consider the problem of finding the best arm in a stochastic multi-armed ba...
This document presents in a unified way different results about the optimal solution of several mult...
A growing problem in network security stems from the fact that both attack methods and target system...
In this paper, we study the problem of estimating the mean values of all the a...