Abstract. We introduce a novel approach to preference-based reinforcement learning, namely a preference-based variant of a direct policy search method based on evolutionary optimization. The core of our approach is a preference-based racing algorithm that selects the best among a given set of candidate policies with high probability. To this end, the algorithm operates on a suitable ordinal preference structure and only uses pairwise comparisons between sample rollouts of the policies. Embedding the racing algorithm in a rank-based evolutionary search procedure, we show that approximations of the so-called Smith set of optimal policies can be produced with certain theoretical guarantees. Apart from a formal performance and complexity analy...
This research reports on the recent development of black-box optimization meth...
Abstract. This paper focuses on reinforcement learning (RL) with limited prior knowledge. In the do...
We introduce a novel approach to preference-based reinforcement learning, nam...
This paper makes a first step toward the integration of two subfields of machine learning, namely pr...
Conventional reinforcement learning algorithms for direct policy search are limited to finding only ...
We propose a generic approach to evolutionary optimization that is suitable for problems in which ca...
Reinforcement Learning (RL) problems appear in diverse real-world applications and are gaining subst...
Direct policy search is a practical way to solve reinforcement learning (RL) problems involving con...
Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tun...
This paper focuses on a class of reinforcement learning problems where significant events are rare a...
Policy search is a successful approach to reinforcement learning. However, policy improvements often...
We reveal a link between particle filtering methods and direct policy search reinforcement learning,...
This paper focuses on a class of reinforcement learning problems where signifi...