Combinatorial stochastic semi-bandits appear naturally in many contexts where the exploration/exploitation dilemma arises, such as web content optimization (recommendation and online advertising) or shortest-path routing. The problem is formulated as follows: an agent sequentially optimizes an unknown and noisy objective function defined on a power set $\mathcal{P}([n])$. For each set $A$ it tries, the agent suffers a loss equal to the expected deviation from the optimal solution, while obtaining observations that reduce its uncertainty about the coordinates in $A$. Our objective is to study the efficiency of policies for this problem, focusing in particular on the following two aspects: statistical efficiency, where the criterion consid...
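The interaction protocol described in this abstract can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the abstract: rewards are Bernoulli, the payoff is linear (the sum of the chosen coordinates), the allowed sets are the subsets of size `m`, and the policy is a generic optimistic (UCB-style) index rule, not necessarily one of the policies studied in the thesis. The function name `run_semi_bandit`, its parameters, and the bonus constant `1.5` are hypothetical choices made for illustration.

```python
import math
import random


def run_semi_bandit(means, m, horizon, seed=0):
    """Simulate a UCB-style index policy on size-m subsets (illustrative only).

    Returns the cumulative expected loss (pseudo-regret) and the number of
    observations collected for each coordinate.
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n    # observations collected per coordinate
    sums = [0.0] * n    # sum of observed rewards per coordinate
    # Assumption: linear payoff, so the optimal set is the m largest means.
    best_value = sum(sorted(means, reverse=True)[:m])
    regret = 0.0
    for t in range(1, horizon + 1):
        # Optimistic index: empirical mean plus a confidence bonus.
        # Coordinates never observed get an infinite index, so they are tried first.
        index = [
            sums[i] / counts[i] + math.sqrt(1.5 * math.log(t) / counts[i])
            if counts[i] > 0 else math.inf
            for i in range(n)
        ]
        # With a linear payoff over size-m sets, the maximiser is the top-m indices.
        chosen = sorted(range(n), key=lambda i: index[i], reverse=True)[:m]
        # Semi-bandit feedback: one noisy observation per coordinate of the chosen set.
        for i in chosen:
            reward = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            sums[i] += reward
        # Loss of the round: expected deviation from the optimal set.
        regret += best_value - sum(means[i] for i in chosen)
    return regret, counts
```

For instance, `run_semi_bandit([0.9, 0.8, 0.2, 0.1], m=2, horizon=500)` plays 500 rounds on four coordinates. The returned counts show how often each coordinate was observed, and the returned regret is the loss accumulated relative to always playing the best pair.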
A stochastic combinatorial semi-bandit with a linear payoff is a sequential learning problem where a...
This thesis studies several extensions of the multi-armed bandit problem, where a learner sequentially s...
Most work on sequential learning assumes a fixed set of actions that are avail...
Sequential decision making is a core component of many real-world applications, from computer-networ...
We improve the efficiency of algorithms for stochastic combinatorial semi-band...
The main topics addressed in this thesis lie in the general domain of sequential learning, and in par...
This thesis concerns model-based methods to solve reinforcement learning problems: these methods def...
This thesis is dedicated to the study of resource allocation problems in uncertain environments, whe...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
In this thesis, we first focus on two stochastic bandit problems. The first problem deals with the f...
This document presents in a unified way different results about the optimal solution of several mult...
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
A Multi-Armed Bandit (MAB) is a learning problem where an agent sequentially chooses an action amon...