Policy gradient methods

Peters, J.

Publication date: November 2010

DOI: 10.4249/scholarpedia.3698

Publisher: Scholarpedia

Abstract

Policy gradient methods are a class of reinforcement learning techniques that optimize parametrized policies with respect to the expected return (the long-term cumulative reward) by gradient ascent. They avoid many of the problems that have marred traditional reinforcement learning approaches, such as the lack of guarantees of a value function, the intractability resulting from uncertain state information, and the complexity arising from continuous states and actions.
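As a rough illustration of the idea in the abstract, the sketch below applies a likelihood-ratio (REINFORCE-style) gradient step to a softmax policy on a two-armed bandit. It is not taken from the article: the bandit task, the reward values, the step size, and the names `softmax` and `train_bandit` are all assumptions chosen to keep the example small.

```python
import math
import random


def softmax(prefs):
    """Turn policy parameters (action preferences) into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(rewards, steps=2000, step_size=0.1, seed=0):
    """Hypothetical helper: likelihood-ratio policy gradient on a bandit.

    `rewards[a]` is the (deterministic, assumed) return for pulling arm `a`.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # policy parameters, one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current parametrized policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        ret = rewards[action]
        # For a softmax policy, d/d theta_a log pi(action) = 1[a == action] - pi(a).
        for a in range(len(theta)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            # Gradient ascent on the expected return.
            theta[a] += step_size * ret * grad_log_pi
    return softmax(theta)


final_probs = train_bandit(rewards=[0.0, 1.0])
print(final_probs)
```

Under these assumptions the policy shifts probability toward the arm with the higher return. No value function is estimated at any point, which is the property the abstract emphasizes.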
