Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes

Maxime Bouton; Jana Tumova; Mykel J. Kochenderfer

AAAI 2020

Abstract

Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.
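To make the idea of approximating a maximum satisfaction probability with point-based value iteration more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the temporal logic objective has already been reduced to reaching an absorbing "goal" state, and it skips the step of combining the POMDP with an automaton for the formula. The four-state model, the names `trans`, `obs`, `backup`, and `pbvi_reachability`, and the 0.85 sensor accuracy are all hypothetical. The value function starts as the indicator of the goal state and is improved by undiscounted, reward-free point-based backups on a fixed grid of beliefs, so every alpha vector is the goal-reaching probability of some finite conditional plan and therefore a lower bound on the maximum.

```python
# Toy illustration only: a hypothetical 4-state POMDP, not the paper's model or code.
S = ["left", "right", "goal", "fail"]        # "goal" and "fail" are absorbing
A = ["open_left", "open_right", "listen"]
O = ["hear_left", "hear_right"]

def trans(s, a):
    """P(s' | s, a) as a list aligned with S."""
    if s == "goal":
        return [0.0, 0.0, 1.0, 0.0]
    if s == "fail":
        return [0.0, 0.0, 0.0, 1.0]
    if a == "listen":                        # listening leaves the hidden state unchanged
        return [1.0, 0.0, 0.0, 0.0] if s == "left" else [0.0, 1.0, 0.0, 0.0]
    correct = (a == "open_left") == (s == "left")
    return [0.0, 0.0, 1.0, 0.0] if correct else [0.0, 0.0, 0.0, 1.0]

def obs(o, s_next, a):
    """P(o | s', a): listening is 85% accurate, everything else is uninformative."""
    if a == "listen" and s_next in ("left", "right"):
        return 0.85 if (o == "hear_left") == (s_next == "left") else 0.15
    return 0.5

# M[a][o][i][j] = P(s_j | s_i, a) * P(o | s_j, a), precomputed once.
M = {a: {o: [[trans(s, a)[j] * obs(o, S[j], a) for j in range(len(S))] for s in S]
         for o in O} for a in A}

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def backup(b, gamma):
    """Point-based Bellman backup at belief b (no reward, no discount)."""
    best = None
    for a in A:
        alpha_a = [0.0] * len(S)
        for o in O:
            # Project every alpha vector one step back through (a, o),
            # then keep the projection that is best at this belief.
            projected = [[dot(row, alpha) for row in M[a][o]] for alpha in gamma]
            g = max(projected, key=lambda v: dot(b, v))
            alpha_a = [x + y for x, y in zip(alpha_a, g)]
        if best is None or dot(b, alpha_a) > dot(b, best):
            best = alpha_a
    return tuple(round(x, 10) for x in best)  # rounded only to deduplicate vectors

def pbvi_reachability(beliefs, iterations=15):
    gamma = {(0.0, 0.0, 1.0, 0.0)}            # indicator of the goal state
    for _ in range(iterations):
        # Keep the old vectors so the estimate never decreases anywhere.
        gamma |= {backup(b, gamma) for b in beliefs}
    return gamma

def value(b, gamma):
    return max(dot(b, alpha) for alpha in gamma)

# Belief points: a grid over P(left), with no mass on the absorbing states.
beliefs = [(i / 10, 1 - i / 10, 0.0, 0.0) for i in range(11)]
gamma = pbvi_reachability(beliefs)
v_uniform = value((0.5, 0.5, 0.0, 0.0), gamma)
print(f"estimated reach probability from the uniform belief: {v_uniform:.4f}")
```

In this toy model, listening longer before opening a door raises the probability of reaching the goal, so the estimate at the uniform belief grows with the number of backups. Keeping earlier alpha vectors is a simplification chosen here for clarity; practical point-based solvers prune dominated vectors and select belief points adaptively.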

Authors

Maxime Bouton, Jana Tumova, Mykel J. Kochenderfer

Topics

Artificial Intelligence > Core AI > Planning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Applications > Value Iteration
Machine Learning > Learning Types > Reinforcement Learning
Mathematics & Optimization > Optimization > Optimization
Mathematics & Optimization > Optimization > Optimal Control

Keywords

belief state, value iteration, partially observable Markov decision process, point-based value iteration, model checking, policy synthesis, autonomous system, linear temporal logic, belief state policy
