Real-Time Control with Reinforcement Learning (Reglering i realtid med förstärkningsinlärning)

Gybäck, Gustav
Röstlund, Fredrik

Publication date

January 2018

Publisher

KTH, Skolan för teknikvetenskap (SCI)

Abstract

We reproduce the Deep Deterministic Policy Gradient (DDPG) algorithm presented in the paper Continuous Control With Deep Reinforcement Learning in order to verify its results. We also aim to explain the machine learning background needed to understand the algorithm. DDPG is a model-free, actor-critic algorithm that uses target networks and mini-batch learning from a replay buffer to increase stability. Batch normalisation is introduced to make the algorithm versatile and applicable to multiple environments with varying value ranges and physical units. We use neural networks as function approximators to handle the large state and action spaces. We show that the algorithm can learn and solve multiple environments using the same setup. Af...
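The two stabilising ingredients named in the abstract, a replay buffer sampled in mini-batches and slowly updated target networks, can be illustrated with a minimal sketch. This is not the thesis code: the class name `ReplayBuffer`, the function `soft_update`, the flat-list representation of network weights, and the mixing factor `tau` are assumptions made for illustration. The update rule is the soft target update described in the original Continuous Control With Deep Reinforcement Learning paper.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples.

    Hypothetical helper for illustration; not taken from the thesis.
    """

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def soft_update(target_params, online_params, tau):
    """Move each target weight a fraction tau towards its online counterpart.

    Weights are plain lists of floats here (an assumption); a real
    implementation would apply the same rule to network tensors.
    """
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

A small `tau` keeps the target networks changing slowly, so the learning targets used by the critic do not shift abruptly from one update to the next.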
