This thesis studies the problem of designing reliable control laws for robotic systems operating in uncertain environments. We tackle this problem using stochastic optimization to iteratively refine the parameters of a control law from a fixed policy class, an approach known as policy search. We introduce several new approaches to stochastic policy optimization based on probably approximately correct (PAC) bounds on the expected performance of control policies. These algorithms, referred to as PAC Robust Policy Search (PROPS), directly minimize an upper confidence bound on the expected cost of trajectories rather than the expected cost itself, as standard approaches do. We compare the performance of PROPS to that of existing policy...
Most policy search (PS) algorithms require thousands of training episodes to f...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2...
Most policy search algorithms require thousands of training episodes to find an effective policy, wh...
A central goal of the robotics community is to develop general optimization algorithms for producing...
For controlling high-dimensional robots, most stochastic optimal control algorithms use approximatio...
With the increasing pace of automation, modern robotic systems need to act in stochastic, non-statio...
For controlling high-dimensional robots, most stochastic optimal control algorithms use a...
A central goal of the robotics community is to develop general optimization algorithms for producing...
In both industrial and service domains, a central benefit of the use of robots is their ability to q...
General autonomy is at the forefront of robotic research and practice. Earlier research has enabled ...
Stochastic motion planning is of crucial importance in robotic applications not only because of the ...
How does uncertainty affect a robot when attempting to generate a control policy to achieve some obj...
Data-driven approaches hold the promise of creating the next wave of robots that can perform diverse...
Policy search is a subfield in reinforcement learning which focuses on finding good parameters for ...
The ability to mentally evaluate variations of the future may well be the key to intelligence. Combi...