A Primal-dual Perspective for Distributed TD-learning

Han Dong Lim; Donghwan Lee

2025 IJCAI IJCAI 2025

A Primal-dual Perspective for Distributed TD-learning

Abstract

The goal of this paper is to investigate distributed temporal difference (TD) learning for a networked multi-agent Markov decision process. The proposed approach is based on distributed optimization algorithms, which can be interpreted as primal-dual ordinary differential equation (ODE) dynamics subject to null-space constraints. Based on the exponential convergence behavior of the primal-dual ODE dynamics subject to null-space constraints, we examine the behavior of the final iterate in various distributed TD-learning scenarios, considering both constant and diminishing step-sizes and incorporating both i.i.d. and Markovian observation models. Unlike existing methods, the proposed algorithm does not require the assumption that the underlying communication network structure is characterized by a doubly stochastic matrix.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization and Reinforcement Learning

🧭 Keyword Pioneer — primal-dual dynamics

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Han Dong Lim , Donghwan Lee

Topics

Machine Learning > Optimization & Theory > Distributed Learning Reinforcement Learning > Methods > Deep RL Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

temporal difference learning distributed optimization markov decision process stochastic process multi-agent system primal-dual dynamics

Download PDF

Related papers

Learning Advanced Self-Attention for Linear Transformers in the Singular Value Domain 2025

Responsibility Anticipation and Attribution in LTLf 2025

Argument-based Multi-Issue Negotiation 2025

Online Resource Sharing: Better Robust Guarantees via Randomized Strategies 2025

Equitable Mechanism Design for Facility Location 2025