Maximum Mean Discrepancy Imitation Learning

Beomjoon Kim; Joelle Pineau

RSS 2013

Abstract

Imitation learning is an efficient method for many robots to acquire complex skills. Some recent approaches to imitation learning provide strong theoretical performance guarantees. However, crucial practical issues remain, especially during the training phase, where the training strategy may require executing control policies that could harm the robot or its environment. Moreover, these algorithms often require more demonstrations than necessary to achieve good performance in practice. This paper introduces a new approach called Maximum Mean Discrepancy Imitation Learning that uses fewer demonstrations and a safer exploration policy than existing methods, while preserving strong theoretical guarantees on performance. We demonstrate the empirical performance of this method for effective navigation control of a social robot in a populated environment, where safety and efficiency during learning are primary considerations.

Topics

Machine Learning > Core Methods > Metric Learning
Reinforcement Learning > Applications > Robotics
Robotics > Capabilities > Navigation
Machine Learning > Learning Types > Representation Learning
Machine Learning > Learning Types > Imitation Learning

Keywords

imitation learning, robot navigation, policy learning, maximum mean discrepancy, safe exploration, social robot
