Scalable Solutions for Decision-Making Systems Using Explainable Policy Representations

Muqsit Azeem

AAAI 2025

Abstract

Despite significant advances in solving Markov Decision Processes (MDPs) and Simple Stochastic Games (SGs), scalability remains a challenge because their state spaces grow exponentially. This thesis aims to push the boundaries of state-of-the-art methods by tackling this issue through 1) explainability and 2) exploitation of the model structure. First, we introduce the *1-2-3-Go* approach, which learns explainable policies from small MDP models and generalizes them to larger instances, improving scalability in MDPs. We then extend *Optimistic Value Iteration (OVI)* and *Sound Value Iteration (SVI)*, both originally designed for MDPs, to SGs, improving efficiency in adversarial settings. Finally, we aim to exploit the *explainable policy representations* and the *model structure* to enhance both scalability and interpretability in SGs. This thesis contributes both theoretical advances and practical solutions for decision-making systems under uncertainty.
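For readers unfamiliar with the baseline that OVI and SVI build on, the following is a minimal sketch of plain value iteration for maximum reachability probabilities in an MDP. It is not the thesis's method: the toy model, the names `value_iteration` and `toy`, and the threshold `eps` are all assumptions made for illustration. The naive stopping test used here (a small change between sweeps) carries no precision guarantee in general, which is the gap that sound variants such as OVI and SVI address.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (probability, successor).
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]


def value_iteration(mdp: MDP, target: Set[str],
                    eps: float = 1e-8, max_iters: int = 10_000) -> Dict[str, float]:
    """Approximate the maximum probability of reaching `target` from each state."""
    # Start from the lower bound: 1 on target states, 0 everywhere else.
    values = {s: (1.0 if s in target else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in mdp.items():
            if s in target or not actions:
                continue  # target and absorbing states keep their value
            # Bellman update: best expected value over all available actions.
            best = max(sum(p * values[t] for p, t in succ)
                       for succ in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < eps:  # naive stopping criterion, not sound in general
            break
    return values


# Illustrative toy model (an assumption, not taken from the thesis).
toy: MDP = {
    "s0": {"a": [(0.5, "goal"), (0.5, "sink")],
           "b": [(0.9, "s1"), (0.1, "sink")]},
    "s1": {"c": [(0.8, "goal"), (0.2, "sink")]},
    "goal": {},
    "sink": {},
}

values = value_iteration(toy, {"goal"})
print(values)
```

In this toy model, action `b` in `s0` is preferable to action `a` because 0.9 × 0.8 = 0.72 exceeds 0.5, so the value of `s0` converges to approximately 0.72.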

Authors

Muqsit Azeem

Topics

Artificial Intelligence > Core AI > Interpretability
Artificial Intelligence > Core AI > Planning
Mathematics & Optimization > Optimization > Game Theory

Keywords

markov decision process, value iteration, adversarial setting, stochastic game, explainable policy, scalable solution, simple stochastic game
