Efficient planning plays a crucial role in model-based reinforcement learning. Traditionally, the main planning operation is a full backup based on the current estimates of the successor states. Consequently, its computation time is proportional to the number of successor states. In this paper, we introduce a new planning backup that uses only the current value of a single successor state and has a computation time independent of the number of successor states. This new backup, which we call a small backup, opens the door to a new class of model-based reinforcement learning methods that exhibit much finer control over their planning process than traditional methods. We empirically demonstrate that this increased flexibility allows for mor...
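The contrast the abstract draws can be made concrete with a minimal sketch (not the paper's code): a full backup sums over every successor state, while a small backup folds in the value change of a single successor in constant time. All names (P, R, V, Q, gamma) are illustrative placeholders, not the paper's notation.

```python
# Hedged sketch of full vs. small backups for a tabular MDP.
# P[(s, a)] maps each successor s2 to its transition probability,
# R[(s, a, s2)] is the reward, V holds state values, Q holds action values.
gamma = 0.9

def full_backup(V, P, R, s, a):
    """Full backup: expectation over every successor, O(|successors|)."""
    return sum(p * (R[(s, a, s2)] + gamma * V[s2])
               for s2, p in P[(s, a)].items())

def small_backup(Q, V, P, s, a, s2, v_old):
    """Small backup: fold the change of one successor's value into the
    stored Q estimate, O(1) regardless of how many successors exist."""
    return Q[(s, a)] + gamma * P[(s, a)][s2] * (V[s2] - v_old)
```

Under these assumptions, applying a small backup for each successor whose value changed reproduces the result of a fresh full backup, but each individual update costs constant time, which is what enables the finer-grained planning control the abstract describes.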
The successes of deep Reinforcement Learning (RL) are limited to settings where we have a large stre...
Search-based planners such as A* and Dijkstra's algorithm are proven methods for guiding today\u2...
We address the problem of computing an optimal value function for Markov decision processes. Since ...
Recent advancements in model-based reinforcement learning have shown that the dynamics of many struc...
PAC-MDP algorithms are particularly efficient in terms of the number of samples obtained from the e...
Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a k...
Abstract. Reinforcement learning (RL) involves sequential decision making in uncertain environments....
Prioritisation of Bellman backups or updating only a small subset of actions represent important tec...
This paper investigates a new approach to model-based reinforcement learning using background planni...
Partial order planning is an important approach that solves planning problems without completely spe...
Planning and reinforcement learning are two key approaches to sequential decision making. Multi-step...
Models of dynamical systems based on predictive state representations (PSRs) use predictions of fut...
We introduce Dynamic Planning Networks (DPN), a novel architecture for deep reinforcement learning, ...
We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contai...