Fitted Q-iteration (FQI) stands out among reinforcement learning algorithms for its flexibility and ease of use. FQI can be combined with any regression method, and this choice determines the algorithm's statistical and computational properties. Combining FQI with an ensemble of regression trees yields an algorithm, FQIT, that is computationally efficient, scalable to high-dimensional spaces, and robust to noise. Despite these properties and its good performance in practice, FQIT has a key limitation: because an ensemble of trees must be constructed (or updated) at each iteration, the algorithm is confined to the batch scenario. This paper aims to address this specific issue. Based on a strategy recently proposed in th...
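To make the batch constraint described above concrete, the following is a minimal sketch of generic fitted Q-iteration with a pluggable regressor. It is not the paper's FQIT: the tree ensemble is replaced by a deliberately trivial stand-in, and the names `MeanRegressor`, `fitted_q_iteration`, and `make_regressor`, as well as the three-state chain used as data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical transition layout: (state, action, reward, next_state, done)


class MeanRegressor:
    """Stand-in regressor (assumption, not a tree ensemble): it predicts
    the mean target observed for each exact (state, action) input."""

    def fit(self, inputs, targets):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for x, y in zip(inputs, targets):
            sums[x] += y
            counts[x] += 1
        self.means = {x: sums[x] / counts[x] for x in sums}
        return self

    def predict(self, x):
        # Inputs never seen during fitting default to 0.0.
        return self.means.get(x, 0.0)


def fitted_q_iteration(batch, actions, make_regressor, gamma=0.9, n_iterations=10):
    """Generic FQI loop: any object with fit/predict can serve as the regressor."""
    inputs = [(s, a) for s, a, _, _, _ in batch]
    q = None  # Q_0 is taken to be identically zero
    for _ in range(n_iterations):
        targets = []
        for s, a, r, s_next, done in batch:
            if q is None or done:
                targets.append(r)
            else:
                best_next = max(q.predict((s_next, b)) for b in actions)
                targets.append(r + gamma * best_next)
        # The regressor is rebuilt on the whole batch at every iteration;
        # this is the step that ties the method to the batch scenario.
        q = make_regressor().fit(inputs, targets)
    return q


# Illustrative data: a 3-state chain where action 1 moves right and
# reaching state 2 yields reward 1 and ends the episode.
batch = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False),
    (1, 1, 1.0, 2, True),
]
q = fitted_q_iteration(batch, actions=[0, 1], make_regressor=MeanRegressor)
```

Swapping `MeanRegressor` for a tree-ensemble regressor with the same `fit`/`predict` shape would leave the loop unchanged, which is the sense in which the regression method is a free choice; the per-iteration refit is the cost an incremental variant would need to avoid.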
This paper presents a novel incremental algorithm that combines Q-learning, a well-known dynamic-pro...
This paper addresses the problem of computing optimal structured treatment interruption...
Reinforcement learning aims to determine an optimal control policy from interaction with a system or...
This paper is about the study of B-FQI, an Approximated Value Iteration (AVI) ...
This paper proposes a novel approach to discover options in the form of stochastic conditionally ter...
A construct that has been receiving attention recently in reinforcement learning is stochastic facto...
In a simulation of an advanced generic cancer trial, I use Q-learning, a reinforcement learning algo...
Reinforcement learning with linear and non-linear function approximation has been studied...
Workshop on Safety, Risk and Uncertainty in Reinforcement Learning. https://sites.google.com/view/rl...
Recently, fitted Q-iteration (FQI) based methods have become more popular due to their increased sam...
In this article, we introduce a new type of tree-based method, reinforcement learning trees (RLT)...