Formally Verified Solution Methods for Markov Decision Processes

Maximilian Schäffeler; Mohammad Abdulaziz

AAAI 2023

Abstract

We formally verify executable algorithms for solving Markov decision processes (MDPs) in the interactive theorem prover Isabelle/HOL. We build on existing formalizations of probability theory to analyze the expected total reward criterion on finite and infinite-horizon problems. Our developments formalize the Bellman equation and give conditions under which optimal policies exist. Based on this analysis, we verify dynamic programming algorithms to solve tabular MDPs. We evaluate the formally verified implementations experimentally on standard problems, compare them with state-of-the-art systems, and show that they are practical.
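As a rough illustration of the kind of dynamic programming algorithm the abstract refers to, the sketch below runs value iteration on a tiny tabular MDP in plain Python. It is not the authors' verified Isabelle/HOL code. The discount factor `gamma`, the simple sup-norm stopping rule, the names `bellman_backup` and `value_iteration`, and the two-state example MDP are all assumptions made for illustration.

```python
def bellman_backup(P, R, v, gamma):
    """Apply the Bellman optimality operator once to the value estimate v.

    P[s][a][t] is the probability of moving from state s to state t under
    action a, and R[s][a] is the immediate reward (hypothetical layout).
    Returns the updated values and a greedy action per state.
    """
    new_v, greedy = [], []
    for s in range(len(P)):
        # Action value: immediate reward plus discounted expected next value.
        q = [
            R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))
            for a in range(len(P[s]))
        ]
        best = max(range(len(q)), key=q.__getitem__)
        new_v.append(q[best])
        greedy.append(best)
    return new_v, greedy


def value_iteration(P, R, gamma=0.9, eps=1e-8):
    """Iterate the backup until successive estimates differ by less than eps."""
    v = [0.0] * len(P)
    while True:
        new_v, policy = bellman_backup(P, R, v, gamma)
        if max(abs(x - y) for x, y in zip(new_v, v)) < eps:
            return new_v, policy
        v = new_v


# Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # transitions from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # transitions from state 1
]
R = [
    [0.0, 1.0],  # rewards in state 0 for actions 0 and 1
    [2.0, 0.0],  # rewards in state 1 for actions 0 and 1
]

values, policy = value_iteration(P, R)
```

In this example the greedy policy moves from state 0 to state 1 and then stays there, since staying in state 1 yields the largest reward at every step. An unverified sketch like this offers no guarantee about its stopping rule or numerical error, which is the kind of gap the formal verification described above is meant to close.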

Authors

Maximilian Schäffeler, Mohammad Abdulaziz

Topics

Machine Learning > Optimization & Theory > Theory

Reinforcement Learning > Applications > Value Iteration

Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling

Machine Learning > Learning Types > Reinforcement Learning

Mathematics & Optimization > Optimization > Optimization

Keywords

formal verification, Markov decision process, Bellman equation, dynamic programming, probability theory, expected total reward
