Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

Samuel Ainsworth; Matt Barnes; Siddhartha Srinivasa

NeurIPS 2019

Abstract

In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
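To make the idea concrete, here is a minimal sketch of an e-stop as an environment wrapper. The abstract does not spell out the construction, so everything below is an illustrative assumption: the toy `LineWorld` environment, the `EStopWrapper` and `support_from_observations` names, the `(state, reward, done)` step interface, and the choice to build the allowed set from states seen in observed trajectories and to assign a fixed reward on a stop.

```python
from typing import Hashable, Iterable, List, Set, Tuple


class LineWorld:
    """Hypothetical toy environment: states 0..n-1, start in the middle, goal at the right end."""

    def __init__(self, n: int = 11) -> None:
        self.n = n
        self.state = n // 2

    def reset(self) -> int:
        self.state = self.n // 2
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action is -1 (left) or +1 (right); clip so the agent stays on the line
        self.state = min(max(self.state + action, 0), self.n - 1)
        done = self.state == self.n - 1
        return self.state, (1.0 if done else 0.0), done


def support_from_observations(trajectories: Iterable[List[Hashable]]) -> Set[Hashable]:
    """Assumed rule: the allowed set is every state seen in the observed trajectories."""
    return {state for trajectory in trajectories for state in trajectory}


class EStopWrapper:
    """Ends the episode as soon as the agent leaves the allowed state set."""

    def __init__(self, env, allowed_states: Iterable[Hashable], stop_reward: float = 0.0) -> None:
        self.env = env
        self.allowed = set(allowed_states)
        self.stop_reward = stop_reward  # assumed reward handed out when the e-stop fires

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done = self.env.step(action)
        if state not in self.allowed:
            # Emergency stop: cut the episode short so states outside the set are never explored
            return state, self.stop_reward, True
        return state, reward, done


# Usage: one observed trajectory that walks from the start state to the goal.
observed = [[5, 6, 7, 8, 9, 10]]
env = EStopWrapper(LineWorld(11), support_from_observations(observed))

env.reset()
left = env.step(-1)   # moves to state 4, which was never observed, so the e-stop fires

env.reset()
right = env.step(+1)  # moves to state 6, which is inside the allowed set
```

Any learner that interacts only through `reset` and `step` can run unchanged on the wrapped environment. In this toy setting the left half of the line is never explored, which is the sense in which a smaller effective state space can reduce the exploration required.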

Topics

Reinforcement Learning > Methods > Policy Learning; Deep Learning > Learning Types > Reinforcement Learning

Keywords

reinforcement learning; sample complexity; regret bound; emergency stop; exploration reduction
