Variational Bayesian Optimization for Runtime Risk-Sensitive Control

Scott Kuindersma; Roderic Grupen; Andrew Barto

RSS 2012

Abstract

We present a new Bayesian policy search algorithm suitable for problems with policy-dependent cost variance, a property present in many robot control tasks. We extend recent work on variational heteroscedastic Gaussian processes to the optimization case to achieve efficient minimization of very noisy cost signals. In contrast to most policy search algorithms, our method explicitly models the cost variance in regions of low expected cost and permits runtime adjustment of risk sensitivity without relearning. Our experiments with artificial systems and a real mobile manipulator demonstrate that flexible risk-sensitive policies can be learned in very few trials.
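The idea of adjusting risk sensitivity at runtime without relearning can be illustrated with a minimal sketch. It assumes a learned model already supplies, for each candidate policy parameter, a predicted expected cost and a predicted cost variance. The selection rule used here (expected cost plus a factor `kappa` times the cost standard deviation), the function name `risk_sensitive_choice`, and the numbers in `PREDICTIONS` are all hypothetical choices made for illustration; the abstract does not specify the paper's actual criterion.

```python
import math


def risk_sensitive_choice(candidates, kappa):
    """Pick the policy parameter minimizing mean + kappa * std of predicted cost.

    candidates: iterable of (theta, expected_cost, cost_variance) tuples.
    kappa: risk factor (assumed form); > 0 is risk-averse, 0 is risk-neutral,
           < 0 is risk-seeking.
    """
    def objective(candidate):
        _, mean, variance = candidate
        return mean + kappa * math.sqrt(variance)

    return min(candidates, key=objective)[0]


# Hypothetical model predictions: (policy parameter, expected cost, cost variance).
# In practice these would come from a model of both the cost mean and variance.
PREDICTIONS = [
    (0.0, 5.0, 0.25),  # high cost, low variance
    (0.5, 3.0, 4.0),   # lowest expected cost, high variance
    (1.0, 3.5, 0.25),  # slightly higher cost, low variance
]

# The same predictions are reused for every kappa: no relearning is needed.
risk_neutral = risk_sensitive_choice(PREDICTIONS, kappa=0.0)
risk_averse = risk_sensitive_choice(PREDICTIONS, kappa=2.0)
risk_seeking = risk_sensitive_choice(PREDICTIONS, kappa=-2.0)
```

Because the predictions are fixed once the model is learned, changing `kappa` only changes which candidate is selected. In this toy data the risk-averse setting trades a small increase in expected cost for a large reduction in variance.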

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Application Areas > Risk Management
Machine Learning > Learning Types > Reinforcement Learning
Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference
Artificial Intelligence > Core AI > Robotics

Keywords

variational bayesian, gaussian process, policy search, robot control, risk-sensitive control, heteroscedastic regression, variational bayesian optimization
