Whether animals behave optimally is an open question of great importance, both theoretically and in practice. Attempts to answer this question focus on two aspects of the optimization problem: the quantity to be optimized and the optimization process itself. In this paper, we assume the abstract concept of cost as the quantity to be minimized and propose a reinforcement learning algorithm, called Value-Gradient Learning (VGL), as a computational model of behavior optimality. We prove that, unlike standard models of Reinforcement Learning, Temporal Difference in particular, VGL is guaranteed to converge to optimality under certain conditions. The core of the proof is the mathematical equivalence of VGL and Pontryagin's Minimum Principle, a w...
Reinforcement learning is often done using parameterized function approximators to store value funct...
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on ...
Many popular optimization algorithms, like the Levenberg-Marquardt algorithm (LMA), use heuristic-ba...
Whether animals behave optimally is an open question of great importance, both theoretically and in ...
In this article we propose a general framework for sequential decision making. The framework is ba...
In this contribution, we discuss Reinforcement Learning as an alternative way to solve optimal contr...
There are several reinforcement learning algorithms that yield approximate solutions for the proble...
In this chapter, we extend the ADP algorithm, dual heuristic programming (DHP), to include a “bootst...
The impulsive preference of an animal for an immediate reward implies that it might subjecti...
This thesis is mostly focused on reinforcement learning, which is viewed as an optimization problem:...
It is well known that, in one form or another, the variational Principle of Least Action (PLA) gover...
Reinforcement Learning is a branch of Artificial Intelligence addressing the problem of single-agent...
Approximate dynamic programming approaches to the reinforcement learning problem are often categoriz...
This thesis presents a novel hierarchical learning framework, Reinforcement Learning Optimal Control...
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of ex...