State-Wise Adaptive Discounting from Experience (SADE): A Novel Discounting Scheme for Reinforcement Learning (Student Abstract)

Milan Zinzuvadiya; Vahid Behzadan

2021 AAAI AAAI 2021

State-Wise Adaptive Discounting from Experience (SADE): A Novel Discounting Scheme for Reinforcement Learning (Student Abstract)

Abstract

Abstract In Markov Decision Process (MDP) models of sequential decision-making, it is common practice to account for temporal discounting by incorporating a constant discount factor. While the effectiveness of fixed-rate discounting in various Reinforcement Learning (RL) settings is well-established, the efficiency of this scheme has been questioned in recent studies. Another notable shortcoming of fixed-rate discounting stems from abstracting away the experiential information of the agent, which is shown to be a significant component of delay discounting in human cognition. To address this issue, we propose State-wise Adaptive Discounting from Experience (SADE) as a novel adaptive discounting scheme for RL agents. SADE leverages the experiential observations of state values in episodic trajectories to iteratively adjust state-specific discount rates. We report experimental evaluations of SADE in Q-learning agents, which demonstrate significant enhancement of sample complexity and convergence rate compared to fixed-rate discounting.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Reinforcement Learning

🧭 Keyword Pioneer — adaptive discounting

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Milan Zinzuvadiya , Vahid Behzadan

Topics

Machine Learning > Optimization & Theory > Optimization Reinforcement Learning > Methods > Policy Learning Machine Learning > Learning Types > Reinforcement Learning Deep Learning > Learning Types > Reinforcement Learning

Keywords

reinforcement learning sample complexity markov decision process discount factor adaptive discounting state value state-wise discount

Download PDF

Related papers

Contextual Conditional Reasoning 2021

Attention Beam: An Image Captioning Approach (Student Abstract) 2021

Movie Summarization via Sparse Graph Construction 2021

Text Analysis for Understanding Symptoms of Social Anxiety in Student Veterans 2021

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations through Scene Graphs 2021