A comparison of action selection methods for implicit policy method reinforcement learning in continuous action-space

Nichols, Barry D.

Publication date

November 2016

DOI

10.1109/IJCNN.2016.7727688

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract

In this paper I investigate methods of applying reinforcement learning to continuous state- and action-space problems without a policy function. I compare the performance of four action selection methods: discretisation of the action-space, and three optimisation techniques that find the greedy action without discretisation. The optimisation methods I apply are gradient descent, Nelder-Mead and Newton's Method. Each action selection method is applied in conjunction with the SARSA algorithm, with a multilayer perceptron used to approximate the value function. The approaches are applied to two simulated continuous state- and action-space control problems: Cart-Pole and double Cart-Pole. The results are...
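The contrast the abstract draws between discretised and optimisation-based greedy action selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_q` is a hypothetical quadratic standing in for the learned multilayer perceptron, the action bounds, grid size and step size are arbitrary, and the gradient is estimated by finite differences rather than computed from a network. Gradient descent is applied to -Q, which is equivalent to ascending Q.

```python
def toy_q(state, action):
    """Hypothetical stand-in for a learned value function Q(s, a)."""
    return -(action - 0.3 * state) ** 2


def greedy_discretised(q, state, low=-1.0, high=1.0, n_actions=11):
    """Evaluate Q on an evenly spaced action grid and return the best grid point."""
    step = (high - low) / (n_actions - 1)
    candidates = [low + i * step for i in range(n_actions)]
    return max(candidates, key=lambda a: q(state, a))


def greedy_gradient(q, state, low=-1.0, high=1.0, start=0.0,
                    step_size=0.1, n_steps=200, eps=1e-5):
    """Gradient descent on -Q(s, a) using a central finite-difference gradient."""
    action = start
    for _ in range(n_steps):
        # Estimate dQ/da at the current action.
        grad = (q(state, action + eps) - q(state, action - eps)) / (2 * eps)
        # Step uphill in Q, then clip to the assumed action bounds.
        action = min(high, max(low, action + step_size * grad))
    return action


if __name__ == "__main__":
    s = 1.5
    print("discretised:", greedy_discretised(toy_q, s))
    print("gradient:   ", greedy_gradient(toy_q, s))
```

In this toy setting the grid search can only return one of its fixed candidate actions, while the gradient-based search can settle between grid points. Whether that difference helps in practice on Cart-Pole or double Cart-Pole is the question the paper examines; this sketch makes no claim about those results.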
