MAGIC: Learning Macro-Actions for Online POMDP Planning

Yiyuan Lee; Panpan Cai; David Hsu

RSS 2021

Abstract

The partially observable Markov decision process (POMDP) is a principled general framework for robot decision making under uncertainty, but POMDP planning suffers from high computational complexity when long-term planning is required. While temporally-extended macro-actions help to cut down the effective planning horizon and significantly improve computational efficiency, how do we acquire good macro-actions? This paper proposes Macro-Action Generator-Critic (MAGIC), which performs offline learning of macro-actions optimized for online POMDP planning. Specifically, MAGIC learns a macro-action generator end-to-end, using an online planner's performance as the feedback. During online planning, the generator produces situation-aware macro-actions on the fly, conditioned on the robot's belief and the environment context. We evaluated MAGIC on several long-horizon planning tasks, both in simulation and on a real robot. The experimental results show that the learned macro-actions offer significant benefits in online planning performance compared with primitive actions and handcrafted macro-actions.
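As a rough illustration of why macro-actions cut down the effective planning horizon, the sketch below counts the leaves of an exhaustive action tree with and without macro-actions. It is a back-of-the-envelope calculation, not the MAGIC algorithm or its planner: it ignores observation branching, assumes every macro-action has the same fixed length, and assumes the planner chooses among the same number of candidates at each node. The function name `search_tree_leaves` and its parameters are hypothetical.

```python
import math


def search_tree_leaves(num_choices: int, horizon: int, macro_length: int = 1) -> int:
    """Count leaves of an exhaustive action tree (observations ignored).

    num_choices:  candidate actions (or macro-actions) per node; assumed constant.
    horizon:      number of primitive time steps to plan over.
    macro_length: primitive steps covered by one macro-action (1 = primitive actions).
    """
    # Each macro-action covers macro_length primitive steps, so the tree
    # needs only ceil(horizon / macro_length) levels to span the horizon.
    depth = math.ceil(horizon / macro_length)
    return num_choices ** depth


# Planning 12 primitive steps ahead with 4 candidates per node.
primitive_leaves = search_tree_leaves(num_choices=4, horizon=12)
macro_leaves = search_tree_leaves(num_choices=4, horizon=12, macro_length=4)
print(primitive_leaves, macro_leaves)  # 16777216 64
```

Under these assumptions the search depth drops from 12 levels to 3. The count says nothing about which macro-actions are worth considering, which is the question the learned generator is meant to address.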

Topics

Artificial Intelligence > Core AI > Planning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Applications > Robotics

Keywords

hierarchical planning, belief state, online planning, POMDP planning, robot decision making

Related papers

Resolving Conflict in Decision-Making for Autonomous Driving 2021

Variational Inference MPC using Tsallis Divergence 2021

Jerk-limited Real-time Trajectory Generation with Arbitrary Target States 2021

Sampling-Based Motion Planning on Sequenced Manifolds 2021

Real-Time Multi-View 3D Human Pose Estimation using Semantic Feedback to Smart Edge Sensors 2021