Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making

Chengchun Shi; Runzhe Wan; Rui Song; Wenbin Lu; Ling Leng

ICML 2020

Abstract

The Markov assumption (MA) is fundamental to the empirical validity of reinforcement learning. In this paper, we propose a novel Forward-Backward Learning procedure to test MA in sequential decision making. The proposed test does not assume any parametric form for the joint distribution of the observed data, and it plays an important role in identifying the optimal policy in high-order Markov decision processes (MDPs) and partially observable MDPs. Theoretically, we establish the validity of our test. Empirically, we apply our test to both synthetic datasets and a real data example from mobile health studies to illustrate its usefulness.
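
To make concrete what a violation of MA looks like, the following is a minimal toy sketch and not the Forward-Backward Learning procedure described above. It assumes a scalar state with no actions and a linear model, and uses a plain least-squares lag check: if the process is first-order Markov, the earlier state should add nothing once the current state is known. The function names `simulate` and `lag2_coefficient`, the coefficients, and the sample size are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n, lag2_coef, seed=0):
    """Simulate s[t+1] = 0.5*s[t] + lag2_coef*s[t-1] + Gaussian noise.

    lag2_coef = 0 gives a first-order Markov chain; any nonzero value
    makes the next state depend on history beyond the current state.
    """
    rng = np.random.default_rng(seed)
    s = np.zeros(n)
    noise = rng.normal(size=n)
    for t in range(1, n - 1):
        s[t + 1] = 0.5 * s[t] + lag2_coef * s[t - 1] + noise[t + 1]
    return s

def lag2_coefficient(s):
    """Least-squares coefficient on s[t-1] when regressing s[t+1] on (s[t], s[t-1])."""
    X = np.column_stack([s[1:-1], s[:-2]])  # columns: s[t], s[t-1]
    y = s[2:]                               # target: s[t+1]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

markov = simulate(20000, lag2_coef=0.0)        # satisfies MA
second_order = simulate(20000, lag2_coef=0.3)  # violates MA (second-order)

print("lag-2 coefficient, Markov data:      ", lag2_coefficient(markov))
print("lag-2 coefficient, second-order data:", lag2_coefficient(second_order))
```

A lag-2 coefficient near zero is consistent with MA in this linear toy setting, while a clearly nonzero one signals dependence on earlier history. A linear check of this kind can miss nonlinear dependence, which is one reason a test with no parametric assumptions on the data distribution is of interest.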

Topics

Artificial Intelligence > Core AI > Causal Inference
Artificial Intelligence > Core AI > Reasoning
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Statistics
Reinforcement Learning > Methods > Value Iteration

Keywords

reinforcement learning, sequential decision making, markov decision process, hypothesis testing, statistical testing, markov property, partially observable mdp, forward-backward learning
