We study the problem of selecting K arms with the highest expected rewards in a stochastic n-armed bandit game. Instead of using existing evaluation metrics (e.g., misidentification probability (Bubeck et al., 2013) or the metric in EXPLORE-K (Kalyanakrishnan & Stone, 2010)), we propose to use the aggregate regret, which is defined as the gap between the average reward of the optimal solution and that of our solution. Besides being a natural metric by itself, we argue that in many applications, such as our motivating example from crowdsourcing, the aggregate regret bound is more suitable. We propose a new PAC algorithm, which, with probability at least 1 − δ, identifies a set of K arms with regret at most ε. We provide the sample complex...
We study the problem of identifying the best arm in each of the bandits in a multi-bandit multi-arme...
We consider the top-k arm identification problem for multi-armed bandits with rewards belonging to a...
We consider the problem of finding the best arm in a stochastic multi-armed ba...
We propose a PAC formulation for identifying an arm in an n-armed bandit whose mean is within a fixe...
We consider the problem of selecting, from among the arms of a stochastic n-armed bandit, a subset o...
We consider the Max K-Armed Bandit problem, where a learning agent is faced with several st...
We consider multi-armed bandit problems where the number of arms is larger than the possible number ...
Regret minimisation in stochastic multi-armed bandits is a well-studied problem, for which several o...
We consider multi-armed bandit problems where the number of arms is larger tha...
In the classical multi-armed bandit problem, d arms are available to the decis...
We consider a stochastic bandit problem with a possibly infinite number of arms. We write p∗ for the...
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) ...