Fast Counterfactual Inference for History-Based Reinforcement Learning

Haichuan Gao; Tianren Zhang; Zhile Yang; Yuqing Guo; Jinsheng Ren; Shangqi Guo; Feng Chen

AAAI 2023

Abstract

Incorporating sequence-to-sequence models into history-based Reinforcement Learning (RL) provides a general way to extend RL to partially-observable tasks. Such methods compress history spaces according to the correlations between historical observations and the rewards. However, they do not adjust for the confounding correlations caused by data sampling, and they assign high beliefs to uninformative historical observations, which limits the compression of history spaces. Counterfactual Inference (CI), which estimates causal effects by single-variable intervention, is a promising way to adjust for confounding. However, it is computationally infeasible to directly apply single-variable intervention to a huge number of historical observations. This paper proposes to perform CI on observation sub-spaces instead of single observations and develops a coarse-to-fine CI algorithm, called Tree-based History Counterfactual Inference (T-HCI), to reduce the number of interventions exponentially. We show that T-HCI is computationally feasible in practice and brings significant sample efficiency gains in various challenging partially-observable tasks, including Maze, BabyAI, and robot manipulation tasks.
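To give a rough intuition for why intervening on observation sub-spaces can need far fewer interventions than intervening on each observation separately, the following is a minimal toy sketch in Python. It is not the T-HCI algorithm from the paper. It is a generic coarse-to-fine (group-testing style) search under strong assumptions: the names `coarse_to_fine_search`, `toy_effect`, and `TRUE_CAUSES` are hypothetical, the effect of a group is assumed to be measurable by a single call, and a group is assumed to show a nonzero effect whenever it contains at least one informative observation.

```python
from typing import Callable, List, Sequence, Tuple


def coarse_to_fine_search(
    indices: Sequence[int],
    group_effect: Callable[[Sequence[int]], float],
    threshold: float,
) -> Tuple[List[int], int]:
    """Return (indices judged informative, number of group interventions used)."""
    informative: List[int] = []
    interventions = 0
    stack = [list(indices)]
    while stack:
        group = stack.pop()
        if not group:
            continue
        interventions += 1  # one intervention on the whole group
        if group_effect(group) <= threshold:
            continue  # no measurable effect: prune the entire sub-space
        if len(group) == 1:
            informative.append(group[0])  # finest level reached
            continue
        mid = len(group) // 2
        stack.append(group[mid:])  # refine: split the group in two
        stack.append(group[:mid])
    return sorted(informative), interventions


# Hypothetical stand-in for an effect estimate (assumption, not from the paper):
# only observations 3 and 700 out of 1024 influence the outcome.
TRUE_CAUSES = {3, 700}


def toy_effect(group: Sequence[int]) -> float:
    return float(sum(1 for i in group if i in TRUE_CAUSES))


found, used = coarse_to_fine_search(range(1024), toy_effect, threshold=0.0)
print(found, used)
```

In this toy setting the search recovers both informative indices with far fewer than the 1024 interventions a per-observation scan would need, because each pruned group removes a whole sub-space at once. How effects are actually estimated and how sub-spaces are constructed in T-HCI is specified in the paper, not by this sketch.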

Topics

Artificial Intelligence > Core AI > Causal Inference
Machine Learning > Optimization & Theory > Statistical Learning
Reinforcement Learning > Methods > Deep RL
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

reinforcement learning, sample efficiency, partial observability, partially observable, causal effect, counterfactual inference, history-based learning
