Most work on sequential learning assumes a fixed set of actions that are available all the time. However, in practice, actions can consist of picking subsets of readings from sensors that may break from time to time, road segments that can be blocked, or goods that are out of stock. In this paper we study learning algorithms that are able to deal with stochastic availability of such unreliable composite actions. We propose and analyze algorithms based on the Follow-The-Perturbed-Leader prediction method for several learning settings differing in the feedback provided to the learner. Our algorithms rely on a novel loss estimation technique that we call Counting Asleep Times. We deliver regret bounds for our algorithms for the previously stu...
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the...
In this paper the sequential prediction problem with expert advice is considered for the case where...
A stochastic combinatorial semi-bandit with a linear payoff is a sequential learning problem where a...
The greedy algorithm has been extensively studied in the field of combinatorial optimization for decades....
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
2012-11-26. The formulations and theories of multi-armed bandit (MAB) problems provide fundamental too...
We consider the problem of asynchronous online combinatorial optimization on a...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
We study how to adapt to smoothly-varying (‘easy’) environments in well-known online learning proble...
Multi-Armed Bandits (MAB) constitute the most fundamental model for sequential decision making probl...
We address online linear optimization problems when the possible actions of the decision maker are r...
In this paper we consider online learning in finite Markov decision processes (MDPs) with changing ...
This thesis studies two online learning problems in which the efficiency of the proposed strategies ...