Explore the Context: Optimal Data Collection for Context-Conditional Dynamics Models

Jan Achterhold; Joerg Stueckler

AISTATS 2021

Abstract

In this paper, we learn dynamics models for parametrized families of dynamical systems with varying properties. The dynamics models are formulated as stochastic processes conditioned on a latent context variable, which is inferred from observed transitions of the respective system. The probabilistic formulation allows us to compute an action sequence which, for a limited number of environment interactions, optimally explores the given system within the parametrized family. This is achieved by steering the system through the transitions that are most informative about the context variable. We demonstrate the effectiveness of our method for exploration on a non-linear toy problem and two well-known reinforcement learning environments.
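To make the idea of "transitions that are most informative about the context variable" concrete, the following is a minimal sketch under strong simplifying assumptions, not the method from the paper. It assumes a scalar context `alpha` with a Gaussian prior, a single-step observation `y = alpha * feature(u) + noise` with Gaussian noise, and a hypothetical sensitivity function `feature(u) = sin(u)`. All names and numbers here are illustrative. In this linear-Gaussian setting the expected information gain has a closed form, so the most informative action can be picked by direct comparison.

```python
import math


def feature(u):
    # Hypothetical: how strongly the observed transition depends on the
    # context alpha when action u is applied (an assumption for illustration).
    return math.sin(u)


def info_gain(u, prior_var, noise_var):
    # Expected information gain (in nats) about alpha from one observation
    # y = alpha * feature(u) + noise, with Gaussian prior and Gaussian noise.
    return 0.5 * math.log(1.0 + prior_var * feature(u) ** 2 / noise_var)


def posterior_var(u, prior_var, noise_var):
    # Posterior variance of alpha after observing one transition under action u.
    return 1.0 / (1.0 / prior_var + feature(u) ** 2 / noise_var)


def most_informative_action(candidates, prior_var, noise_var):
    # Pick the candidate action with the largest expected information gain.
    return max(candidates, key=lambda u: info_gain(u, prior_var, noise_var))


candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
best = most_informative_action(candidates, prior_var=1.0, noise_var=0.1)
print("most informative action:", best)
print("posterior variance:", posterior_var(best, prior_var=1.0, noise_var=0.1))
```

An action with `feature(u) = 0` reveals nothing about `alpha`, so its information gain is zero and the posterior variance stays at the prior value. The paper's setting replaces this closed-form model with a learned, context-conditional dynamics model and plans over action sequences rather than single actions.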

Topics

Artificial Intelligence > Core AI > Planning
Machine Learning > Learning Types > Self-Supervised Learning
Reinforcement Learning > Applications > Robotics
Machine Learning > Learning Types > Reinforcement Learning

Keywords

reinforcement learning, stochastic process, dynamics model, optimal exploration, latent context, latent context variable, probabilistic formulation, context variable
