Optimizing Long-term Predictions for Model-based Policy Search

Andreas Doerr; Christian Daniel; Duy Nguyen-Tuong; Alonso Marco; Stefan Schaal; Marc Toussaint; Sebastian Trimpe

CoRL 2017

Abstract

We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays, and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories, as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive with state-of-the-art model learning methods. In contrast to these more involved models, our model can be employed directly for policy search, and it outperforms a baseline method in the robot experiment.
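The difference between the two training criteria named in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' model: it assumes a scalar linear system x[t+1] = a*x[t] + b*u[t], a deterministic rollout, and Gaussian observation noise with a fixed standard deviation `sigma`. All function names and parameters here are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_nll(residuals, sigma):
    """Sum of negative log-densities of N(0, sigma^2) evaluated at the residuals."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sum(0.5 * np.log(2.0 * np.pi * sigma**2)
                        + 0.5 * (residuals / sigma) ** 2))


def one_step_nll(a, b, ys, us, sigma):
    """One-step-ahead criterion: each prediction starts from the measured y[t]."""
    preds = a * ys[:-1] + b * us
    return gaussian_nll(ys[1:] - preds, sigma)


def rollout_nll(a, b, ys, us, sigma):
    """Long-term criterion: roll the model out from y[0] using only the actions,
    then score the whole observed trajectory against the rollout."""
    x = ys[0]
    preds = []
    for u in us:
        x = a * x + b * u  # the model's own prediction is fed back in
        preds.append(x)
    return gaussian_nll(ys[1:] - np.array(preds), sigma)


def simulate(a, b, us, x0, noise_std, rng):
    """Generate a trajectory of len(us) + 1 noisy observations (assumed toy system)."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    xs = np.array(xs)
    return xs + noise_std * rng.standard_normal(xs.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    us = np.ones(20)
    ys = simulate(0.9, 0.5, us, x0=1.0, noise_std=0.05, rng=rng)
    for a_guess in (0.9, 0.85):
        print(a_guess,
              one_step_nll(a_guess, 0.5, ys, us, sigma=0.05),
              rollout_nll(a_guess, 0.5, ys, us, sigma=0.05))
```

In this sketch, a small error in `a` produces a roughly constant per-step residual under the one-step criterion, while under the rollout criterion the error compounds over the horizon, so the trajectory likelihood penalizes it more strongly.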

Topics

Reinforcement Learning > Methods > Deep RL
Robotics > Capabilities > Manipulation
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Optimization & Theory > Probabilistic Modeling
Deep Learning > Learning Types > Reinforcement Learning

Keywords

policy search; model-based reinforcement learning; generative model; dynamics model; long-term prediction; trajectory likelihood

Related papers

CORe50: a New Dataset and Benchmark for Continuous Object Recognition 2017

Active Incremental Learning of Robot Movement Primitives 2017

Efficient Automatic Perception System Parameter Tuning On Site without Expert Supervision 2017

Opportunistic Active Learning for Grounding Natural Language Descriptions 2017

Adaptable Pouring: Teaching Robots Not to Spill using Fast but Approximate Fluid Simulation 2017