In this work, we propose KeRNS, an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes (MDPs) whose state-action set is endowed with a metric. Using a non-parametric model of the MDP built with time-dependent kernels, we prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP over time, which quantifies its level of non-stationarity. Our method generalizes previous approaches based on sliding windows and exponential discounting used to handle changing environments. We further propose a practical implementation of KeRNS, analyze its regret, and validate it experimentally.
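To make the idea of a time-dependent kernel concrete, the sketch below estimates a reward at a query point as a weighted average of past observations, where each weight is the product of a spatial term (decaying with metric distance) and a temporal term (decaying with the age of the sample). Choosing the temporal term as an indicator recovers a sliding window, and choosing it as a geometric factor recovers exponential discounting. This is only an illustration under stated assumptions: the product form of the weight, the Gaussian spatial kernel, the regularizer `reg`, and all function names are hypothetical and are not taken from the KeRNS paper.

```python
import math

def gaussian_space_kernel(dist, bandwidth):
    # Assumed spatial kernel: weight decays with metric distance.
    return math.exp(-((dist / bandwidth) ** 2) / 2.0)

def sliding_window(window):
    # Temporal weight: 1 for samples younger than `window` episodes, else 0.
    return lambda age: 1.0 if age < window else 0.0

def exponential_discount(eta):
    # Temporal weight: eta**age for a sample that is `age` episodes old.
    return lambda age: eta ** age

def no_forgetting():
    # Temporal weight of a stationary estimator: every sample counts equally.
    return lambda age: 1.0

def kernel_reward_estimate(history, query, now, metric, bandwidth,
                           time_weight, reg=1e-3):
    """Weighted average of past rewards near `query`.

    history: list of (episode_index, state_action, reward) tuples.
    reg: small regularizer (an assumption) that avoids division by zero.
    """
    num, den = 0.0, reg
    for episode, point, reward in history:
        weight = (gaussian_space_kernel(metric(point, query), bandwidth)
                  * time_weight(now - episode))
        num += weight * reward
        den += weight
    return num / den

# Toy non-stationary problem: the reward at x = 0.0 jumps from 0 to 1
# at episode 50, and we estimate it at episode 100.
metric = lambda a, b: abs(a - b)
history = [(t, 0.0, 0.0 if t < 50 else 1.0) for t in range(100)]

stationary = kernel_reward_estimate(history, 0.0, 100, metric, 0.1,
                                    no_forgetting())
windowed = kernel_reward_estimate(history, 0.0, 100, metric, 0.1,
                                  sliding_window(10))
discounted = kernel_reward_estimate(history, 0.0, 100, metric, 0.1,
                                    exponential_discount(0.9))
```

In this toy setting the estimator without forgetting averages over both regimes, while the windowed and discounted estimators track the most recent reward level. Both forgetting schemes are instances of the same weighting template, which is the sense in which a time-dependent kernel can cover them.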
Abstract. We present a kernel-based approach to reinforcement learning that overcomes the stability ...
Learning to act optimally in the complex world has long been a major goal in artificial intelligence...
Reinforcement learning (RL) has emerged as a general-purpose technique for addressing problems invol...
We consider the exploration-exploitation dilemma in finite-horizon reinforceme...
We consider un-discounted reinforcement learning (RL) in Markov decision processes (MDPs) under dri...
Reinforcement learning in non-stationary environments is generally regarded as a very difficult prob...
Reinforcement learning (RL) studies the problem where an agent maximizes its cumulative reward throu...
We consider an agent interacting with an environment in a single stream of act...
We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. ...
We study reinforcement learning for continuous-time Markov decision processes (MDPs) in the finite-h...
We study episodic reinforcement learning (RL) in non-stationary linear kernel Markov decision proces...
We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic no...
We consider the problem of online reinforcement learning when several state re...
The problem of reinforcement learning in an unknown and discrete Markov Decisi...