Reinforcement learning is a hard problem, and the majority of existing algorithms suffer from poor convergence properties on difficult problems. In this paper we propose a new reinforcement learning method that utilizes the power of global optimization methods such as simulated annealing. Specifically, we use a particularly powerful version of simulated annealing called Adaptive Simulated Annealing (ASA) [3]. Towards this end we consider a batch formulation of the reinforcement learning problem, unlike the online formulation that is almost always used. The advantage of the batch formulation is that it allows state-of-the-art optimization procedures to be employed, and thus can lead to further improvements in algorithmic convergence p...
In the Artificial Bee Colony (ABC) algorithm, the employed bee and the onlooker bee phase involve up...
The balance between exploration and exploitation is one of the key problems of action selection in Q...
Finding the global minimum of a nonconvex optimization problem is a notoriousl...
A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide ...
Many computational problems can be solved by multiple algorithms, with different algo...
Training agents over sequences of tasks is often employed in deep reinforcement learning to let the ...
Simulated Annealing is a meta-heuristic that performs a randomized local search to reach near-optima...
We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time O...
Reinforcement learning is defined as the problem of an agent that learns to perform a certain task t...
evaluation functions, local search, heuristic search, simulated annealing, value function approximat...
We address the problem of non-convergence of online reinforcement learning algorithms (e.g., Q learn...
In this paper, a new algorithm based on case base reasoning and reinforcement learning (RL) is propo...
In recent years, reinforcement learning has become incredibly popular as a method to find good solut...
Any nonassociative reinforcement learning algorithm can be viewed as a method for performing functio...
Adaptive simulated annealing (ASA) is a global optimization algorithm based on an associated proof ...