We study episodic reinforcement learning (RL) in non-stationary linear kernel Markov decision processes (MDPs). In this setting, both the reward function and the transition kernel are linear with respect to the given feature maps and are allowed to vary over time, as long as their respective parameter variations do not exceed certain variation budgets. We propose the $\underline{\text{p}}$eriodically $\underline{\text{r}}$estarted $\underline{\text{o}}$ptimistic $\underline{\text{p}}$olicy $\underline{\text{o}}$ptimization algorithm (PROPO), which is an optimistic policy optimization algorithm with linear function approximation. PROPO features two mechanisms: sliding-window-based policy evaluation and periodic-restart-based policy improveme...
We present a Reinforcement Learning (RL) algorithm based on policy iteration for solving a...
We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic no...
Reinforcement learning (RL) has attracted rapidly increasing interest in the machine learning and ar...
Reinforcement learning is a family of machine learning algorithms, in which the system learns to mak...
We consider un-discounted reinforcement learning (RL) in Markov decision processes (MDPs) under dri...
We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic no...
We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogen...
We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision pr...
We propose Generalized Trust Region Policy Optimization (GTRPO), a policy gradient Reinforcement Lea...
We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision pr...
Reinforcement learning (RL) has emerged as a general-purpose technique for addressing problems invol...
Reinforcement learning (RL) studies the problem where an agent maximizes its cumulative reward throu...
We work towards a unifying paradigm for accelerating policy optimization methods in reinforcement le...
Approximate dynamic programming approaches to the reinforcement learning problem are often categoriz...
In this work, we propose KeRNS: an algorithm for episodic reinforcement learni...