This thesis investigates sequential decision-making tasks that fall within the framework of reinforcement learning (RL). In these tasks, a decision maker repeatedly interacts with an environment modeled by an unknown finite Markov decision process (MDP) and seeks to maximize the reward she accumulates over the course of the interaction. Her performance can be measured by the regret, which compares her accumulated expected reward with that achieved by an oracle algorithm that always behaves optimally. To maximize her accumulated reward, or equivalently to minimize her regret, she must balance exploration and exploitation. The first part of this thesis investigates combinatorial multi-armed ba...
Reinforcement learning (RL) has attracted increasing interest in recent years, being expected to del...
We consider a reinforcement learning setting where the learner also has to dea...
Sequential decision making abounds in real-world problems ranging from robots needing to interact ...
We consider a class of sequential decision-making problems in the presence of uncertainty, which bel...
We consider an agent interacting with an environment in a single stream of actions, observations, an...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
We consider reinforcement learning in a discrete, undiscounted, infinite-horiz...
The problem of reinforcement learning in an unknown and discrete Markov Decisi...
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the...
We consider the problem of minimizing the long-term average expected regret of an agent in an online...
Multi-armed bandits (MAB) constitute the most fundamental model for sequential decision-making probl...