Peer reviewed. We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems. This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions that lead to high-quality policies when used as input to a batch-mode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS meta-algorithm that uses tree-based Fitted Q-Iteration as the batch-mode RL algorithm and Cross Entropy search as a method for navigating efficiently in the space of sample sets. The results show that this particular instance of OSS is able to rapidly identify small sample sets leading to high-quality policies.
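To make the search loop described in the abstract more concrete, here is a minimal sketch of a cross-entropy search over sample sets. It is not the authors' implementation and rests on several simplifying assumptions: the sample set is chosen as a subset of a finite candidate pool, each candidate has an independent Bernoulli inclusion probability, and the hypothetical `score` callable stands in for "run a batch-mode RL algorithm (such as Fitted Q-Iteration) on the selected transitions and evaluate the resulting policy". The function name, parameters, and toy scoring function are all illustrative.

```python
import numpy as np


def cross_entropy_subset_search(score, n_candidates, n_iters=30, pop_size=100,
                                elite_frac=0.2, smoothing=0.7, seed=0):
    """Cross-entropy search over subsets of a finite candidate pool.

    `score` is a hypothetical callable mapping a boolean mask (which
    candidates are kept) to a float, higher is better. In an OSS-like
    setting it would train a policy on the selected transitions and
    return its estimated return; here it is left abstract.
    """
    rng = np.random.default_rng(seed)
    probs = np.full(n_candidates, 0.5)  # Bernoulli inclusion probabilities
    n_elite = max(1, int(elite_frac * pop_size))
    best_mask, best_score = None, -np.inf

    for _ in range(n_iters):
        # Sample a population of candidate subsets from the current distribution.
        masks = rng.random((pop_size, n_candidates)) < probs
        scores = np.array([score(m) for m in masks])

        # Keep the best-scoring subsets and move the distribution toward them.
        elite = masks[np.argsort(scores)[-n_elite:]]
        probs = smoothing * elite.mean(axis=0) + (1 - smoothing) * probs
        probs = np.clip(probs, 0.02, 0.98)  # keep some exploration alive

        # Track the best subset seen so far.
        top = int(np.argmax(scores))
        if scores[top] > best_score:
            best_mask, best_score = masks[top].copy(), float(scores[top])

    return best_mask, best_score


# Toy stand-in for policy quality (an assumption, not from the paper):
# the first 5 candidates are "useful", and every kept sample has a small cost,
# so small sets containing the useful samples score highest.
useful = np.zeros(40, dtype=bool)
useful[:5] = True


def toy_score(mask):
    return float(np.sum(mask & useful)) - 0.1 * float(np.sum(mask))


best_mask, best_score = cross_entropy_subset_search(toy_score, n_candidates=40)
```

The per-sample cost in `toy_score` mimics the preference for small sample sets; in a real setting the scoring step would dominate the run time, since each evaluation involves training and evaluating a policy.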
In this paper, we study the optimal stopping problem in the so-called exploratory framework, in whic...
Many computational problems can be solved by multiple algorithms, with different algorithms fastest ...
In the field of Reinforcement Learning, Markov Decision Processes with a finite number of states and...
Abstract: We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time O...
Reinforcement learning aims to determine an optimal control policy from interaction with a system or...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...
A common setting of reinforcement learning (RL) is a Markov decision process (MDP) in which the envi...
This dissertation presents various research contributions published during these four years of PhD i...
International audience. In reinforcement learning, an agent collects information interacting with an e...
Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balanc...
Control engineering researchers are increasingly embracing data-driven techniques like reinforcement...
We consider batch reinforcement learning problems in continuous space, expected total discounted-rew...
International audience. One of the challenges in online reinforcement learning (RL) is that the agent ...
The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular s...
Peer reviewed. We propose new methods for guiding the generation of informative trajectories when solv...