Markov Decision Processes with Time-Varying Geometric Discounting

Jiarui Gan; Annika Hennes; Rupak Majumdar; Debmalya Mandal; Goran Radanovic

2023 AAAI AAAI 2023

Markov Decision Processes with Time-Varying Geometric Discounting

Abstract

Abstract Canonical models of Markov decision processes (MDPs) usually consider geometric discounting based on a constant discount factor. While this standard modeling approach has led to many elegant results, some recent studies indicate the necessity of modeling time-varying discounting in certain applications. This paper studies a model of infinite-horizon MDPs with time-varying discount factors. We take a game-theoretic perspective – whereby each time step is treated as an independent decision maker with their own (fixed) discount factor – and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems. We present a constructive proof of the existence of an SPE and demonstrate the EXPTIME-hardness of computing an SPE. We also turn to the approximate notion of epsilon-SPE and show that an epsilon-SPE exists under milder assumptions. An algorithm is presented to compute an epsilon-SPE, of which an upper bound of the time complexity, as a function of the convergence property of the time-varying discount factor, is provided.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Mathematics & Optimization and Reinforcement Learning

🧭 Keyword Pioneer — time-varying discounting

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy

Authors

Jiarui Gan , Annika Hennes , Rupak Majumdar , Debmalya Mandal , Goran Radanovic

Topics

Machine Learning > Optimization & Theory > Theory Reinforcement Learning > Applications > Value Iteration Machine Learning > Learning Types > Reinforcement Learning Mathematics & Optimization > Optimization > Game Theory Artificial Intelligence > Core AI > Game Theory

Keywords

game theory markov decision process subgame perfect equilibrium time-varying discounting exponential time complexity exponential time

Download PDF

Related papers

A Model-Agnostic Heuristics for Selective Classification 2023

Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract) 2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer 2023

Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning 2023

Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse 2023