This paper addresses the problem of learning control policies for mobile robots, modeled as unknown Markov Decision Processes (MDPs), that are tasked with temporal logic missions, such as sequencing, coverage, or surveillance. The MDP captures uncertainty in the workspace structure and the outcomes of control decisions. The control objective is to synthesize a control policy that maximizes the probability of accomplishing a high-level task, specified as a Linear Temporal Logic (LTL) formula. To address this problem, we propose a novel accelerated model-based reinforcement learning (RL) algorithm for LTL control objectives that is capable of learning control policies significantly faster than related approaches. Its sample efficiency relies ...
In this paper, we develop a method to automatically generate a control policy for a dynamical system...
Reactive synthesis algorithms allow automatic construction of policies to control an environment mod...
Reinforcement learning (RL) is a promising approach. However, success is limited to real-world appli...
We present a model-free reinforcement learning algorithm to synthesize control policies that maximiz...
We present a model-free reinforcement learning algorithm to synthesize control policies that maximiz...
Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied...
Ensuring safety and meeting temporal specifications are critical challenges for long-term robotic ta...
Unlike the standard Reinforcement Learning (RL) model, many real-world tasks are non-Markovian, whos...
We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a ri...
In recent years, researchers have made significant progress in devising reinforcement-learning algor...
Synthesis from linear temporal logic (LTL) specifications provides assured controllers for systems o...
Linear temporal logic (LTL) and omega-regular objectives -- a superset of LTL -- have seen recent us...
We propose to synthesize a control policy for a Markov decision process (MDP) such that the resultin...
We consider synthesis of control policies that maximize the probability of satisfying give...
We propose to synthesize a control policy for a Markov decision process (MDP) such that t...