Active Exploration via Experiment Design in Markov Chains

Mojmir Mutny; Tadeusz Janik; Andreas Krause

AISTATS 2023

Abstract

A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. Classical experimental design optimally allocates the experimental budget into measurements to maximize a notion of utility (e.g., reduction in uncertainty about the unknown quantity). We consider a rich setting, where the experiments are associated with states in a Markov chain, and we can only choose them by selecting a policy controlling the state transitions. This problem captures important applications, from exploration in reinforcement learning to spatial monitoring tasks. We propose an algorithm – markov-design – that efficiently selects policies whose measurement allocation provably converges to the optimal one. The algorithm is sequential in nature, adapting its choice of policies (experiments) using past measurements. In addition to our theoretical analysis, we demonstrate our framework on applications in ecological surveillance and pharmacology.
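To make the setting in the abstract concrete, the following is a minimal sketch, not the paper's markov-design algorithm. It shows how a policy in a small Markov chain induces an allocation of measurements over states, and how that allocation could be scored with a classical design utility. The array layout, the function names `state_visitation` and `d_optimal_utility`, the toy three-state chain, and the choice of the regularized log-determinant (D-optimal) criterion as the utility are all assumptions made for illustration.

```python
import numpy as np

def state_visitation(P, pi, mu0, horizon):
    """Average state-visitation distribution of a policy over a finite horizon.

    Assumed (hypothetical) layout:
      P[a, s, t]  probability of moving from state s to state t under action a
      pi[s, a]    probability of choosing action a in state s
      mu0[s]      initial state distribution
    """
    # Transition matrix induced by the policy: P_pi[s, t] = sum_a pi[s, a] * P[a, s, t]
    P_pi = np.einsum("sa,ast->st", pi, P)
    d = np.zeros_like(mu0, dtype=float)
    mu = mu0.astype(float)
    for _ in range(horizon):
        d += mu          # accumulate the state distribution at this step
        mu = mu @ P_pi   # advance the chain by one step
    return d / horizon

def d_optimal_utility(d, Phi, reg=1e-3):
    """Log-determinant of the regularized information matrix sum_s d[s] phi_s phi_s^T."""
    info = Phi.T @ (d[:, None] * Phi) + reg * np.eye(Phi.shape[1])
    _, logdet = np.linalg.slogdet(info)
    return logdet

# Toy chain (assumed): 3 states on a cycle; action 0 stays put, action 1 moves to the next state.
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)
P[1] = np.roll(np.eye(3), 1, axis=1)
mu0 = np.array([1.0, 0.0, 0.0])
Phi = np.eye(3)  # one feature per state, so each state informs a different parameter

pi_stay = np.array([[1.0, 0.0]] * 3)  # always choose action 0
pi_move = np.array([[0.0, 1.0]] * 3)  # always choose action 1

d_stay = state_visitation(P, pi_stay, mu0, horizon=3)
d_move = state_visitation(P, pi_move, mu0, horizon=3)
u_stay = d_optimal_utility(d_stay, Phi)
u_move = d_optimal_utility(d_move, Phi)
```

In this toy example the policy that keeps moving spreads its measurements evenly over the three states and scores a higher utility than the policy that stays in the initial state. The experimenter never picks states directly; the allocation can only be shaped through the choice of policy.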

Topics

Artificial Intelligence > Core AI > Planning
Machine Learning > Learning Types > Active Learning
Machine Learning > Optimization & Theory > Stochastic Processes
Mathematics & Optimization > Optimization > Optimal Control

Keywords

policy optimization; sequential decision making; Markov chain; experiment design; active exploration; optimal allocation
