NeurIPS 2024
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Abstract
This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning. Building on this assumption, we develop a two-way deconfounder algorithm that uses a neural tensor network to learn the unmeasured confounders and the system dynamics simultaneously. From the learned model, we construct a model-based estimator for consistent policy value estimation. We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
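For readers unfamiliar with the panel data model the abstract refers to, the textbook form of a two-way fixed effects regression is sketched below. The notation is an assumption chosen for illustration and is not taken from the paper, whose own formulation of the two-way unmeasured confounding assumption may differ.

```latex
% Standard two-way fixed effects regression for panel data (textbook form).
% Symbols are illustrative assumptions, not the paper's notation:
%   i = 1..N indexes units, t = 1..T indexes time periods,
%   Y_{it} is the outcome, X_{it} the observed covariates,
%   \alpha_i is a unit effect (constant over time),
%   \lambda_t is a time effect (constant across units).
\begin{equation}
  Y_{it} = \alpha_i + \lambda_t + X_{it}^{\top}\beta + \varepsilon_{it},
  \qquad i = 1,\dots,N,\quad t = 1,\dots,T .
\end{equation}
```

The two additive terms absorb unobserved heterogeneity that varies only across units and only across time, respectively, which is the "two-way" structure the abstract draws on.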
Topics
Artificial Intelligence > Core AI > Causal Inference
Machine Learning > Optimization & Theory > Statistical Learning
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Offline RL
Knowledge & Reasoning > Reasoning > Causal Inference
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Bayesian & Probabilistic > Bayesian Inference