Scalable Solutions for Decision-Making Systems Using Explainable Policy Representations
Abstract
Despite significant advances in solving Markov Decision Processes (MDPs) and Simple Stochastic Games (SGs), scalability remains a challenge because their state spaces grow exponentially. This thesis aims to push the boundaries of state-of-the-art methods by tackling this issue in two ways: 1) using explainability and 2) exploiting the model structure. First, we introduce the *1-2-3-Go* approach, which learns explainable policies from small MDP models and generalizes them to larger instances, improving scalability in MDPs. We then extend *Optimistic Value Iteration (OVI)* and *Sound Value Iteration (SVI)*, both originally designed for MDPs, to SGs, improving efficiency in adversarial settings. Finally, we aim to exploit *explainable policy representations* and the *model structure* to enhance both scalability and interpretability in SGs. This thesis contributes both theoretical advances and practical solutions for decision-making systems under uncertainty.