Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation

Dotan D. Castro; Dmitry Volkinshtein; Ron Meir

NIPS (NeurIPS) 2008

Abstract

Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g., when function approximation is involved). Interestingly, there is growing evidence that actor-critic approaches based on phasic dopamine signals play a key role in biological learning through the cortex and basal ganglia. We derive a temporal-difference-based actor-critic learning algorithm for which convergence can be proved without assuming separate time scales for the actor and the critic. The approach is demonstrated by applying it to networks of spiking neurons. The established relation between phasic dopamine and the temporal difference signal lends support to the biological relevance of such algorithms.
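
To make the central idea concrete, the following is a minimal sketch of a generic one-step temporal difference actor-critic loop in Python. It is not the algorithm derived in the paper: it assumes a discounted, tabular setting with a softmax policy, and the two-state environment (`step`), the function names, and all parameter values are hypothetical choices made for illustration. Giving the actor and the critic the same step size here is only a convenience and does not reproduce the paper's convergence analysis. What the sketch does show is how a single scalar TD error can drive both the critic's value update and the actor's policy update.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Hypothetical toy environment: action 1 pays reward 1, action 0 pays 0.

    The state alternates between 0 and 1 regardless of the action taken.
    """
    reward = 1.0 if action == 1 else 0.0
    return 1 - state, reward


def train(num_steps=5000, gamma=0.9, alpha_critic=0.1, alpha_actor=0.1, seed=0):
    rng = random.Random(seed)
    values = [0.0, 0.0]                # critic: one value estimate per state
    prefs = [[0.0, 0.0], [0.0, 0.0]]   # actor: action preferences per state
    state = 0
    for _ in range(num_steps):
        probs = softmax(prefs[state])
        action = 0 if rng.random() < probs[0] else 1
        next_state, reward = step(state, action)

        # TD error: the single scalar signal shared by critic and actor.
        td_error = reward + gamma * values[next_state] - values[state]

        # Critic update: move the value estimate toward the TD target.
        values[state] += alpha_critic * td_error

        # Actor update: softmax policy-gradient step, with the TD error
        # standing in for the return.
        for a in range(2):
            indicator = 1.0 if a == action else 0.0
            prefs[state][a] += alpha_actor * td_error * (indicator - probs[a])

        state = next_state
    return values, [softmax(p) for p in prefs]


if __name__ == "__main__":
    values, policy = train()
    print("state values:", values)
    print("policy:", policy)
```

In this toy setting the TD error tends to be positive after the rewarded action and negative after the unrewarded one, so the policy should drift toward the rewarded action in both states while the critic's estimates grow toward the discounted return.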

Authors

Dotan D. Castro, Dmitry Volkinshtein, Ron Meir

Topics

Reinforcement Learning > Methods > Deep RL

Reinforcement Learning > Methods > Policy Learning

Interdisciplinary > Cognitive Science > Cognitive Modeling

Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

reinforcement learning, temporal difference learning, policy learning, actor-critic, temporal difference, spiking neuron, actor-critic algorithm, spiking neural network, neural network, convergence proof, actor-critic learning, dopamine signal

Related papers

On the Efficient Minimization of Classification Calibrated Surrogates 2008

Hebbian Learning of Bayes Optimal Decisions 2008

Biasing Approximate Dynamic Programming with a Lower Discount Factor 2008

Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation 2008

Domain Adaptation with Multiple Sources 2008