Different allocation strategies can be found in the literature to deal with the multi-armed bandit problem, under either a frequentist view or a Bayesian perspective. In this paper, we propose a novel allocation strategy, the possibilistic reward method. First, possibilistic reward distributions are used to model the uncertainty about the arm expected rewards, which are then converted into probability distributions using a pignistic probability transformation. Finally, a simulation experiment is carried out to identify the arm with the highest expected reward, which is then pulled. A parametric version of the proposed probability transformation is then introduced together with a dynamic optimization, which implies that neither previous knowledge nor a...
We propose a theoretical and computational framework for approximating the optimal policy in multi-a...
Suppose two treatments with binary responses are available for patients with some disease. Sequentia...
We consider the problem of finding the best arm in a stochastic multi-armed ba...
In this paper, we propose a set of allocation strategies to deal with the multi-armed bandit problem...
We study a multi-armed bandit problem in a setting where covariates are available. We take a nonpara...
In a multi-armed bandit (MAB) problem a gambler needs to choose at each round of play one of K arms,...
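As a concrete illustration of the K-armed setting described in the abstract above, the following is a minimal sketch of an epsilon-greedy gambler on a Bernoulli bandit. It is not the method of any of the cited papers; the function name, parameters, and arm means are all illustrative.

```python
import random

def eps_greedy_bandit(true_means, rounds=10000, eps=0.1, seed=0):
    """Simulate an epsilon-greedy player on a K-armed Bernoulli bandit.

    true_means: unknown-to-the-player success probability of each arm.
    Returns the average reward obtained and the pull count per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k          # number of pulls per arm
    estimates = [0.0] * k     # running mean reward per arm
    total = 0.0
    for _ in range(rounds):
        if rng.random() < eps:                          # explore: random arm
            arm = rng.randrange(k)
        else:                                           # exploit: best estimate so far
            arm = max(range(k), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / rounds, counts

avg, counts = eps_greedy_bandit([0.2, 0.5, 0.8])
```

With a long enough horizon, the arm with the highest true mean accumulates most of the pulls, and the average reward approaches (1 - eps) times the best mean plus the exploration penalty.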
We present a formal model of human decision-making in explore-exploit tasks using the conte...
We consider a multiarmed bandit problem where the expected reward of each arm is a linear function o...
A multi-armed bandit is the simplest problem to study learning under uncertainty when decisions affe...
Multi-armed bandit, a popular framework for sequential decision-making problems, has recently gained...
Published version of an article from Lecture Notes in Computer Science. Also available at SpringerLi...
In this paper we investigate the multi-armed bandit problem, where each arm generates an infinite se...
The two-armed bandit problem is a classical optimization problem where a player sequentially selects...