The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond

Jiin Woo; Gauri Joshi; Yuejie Chi

JMLR 2025

Abstract

In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone. Focusing on infinite-horizon tabular Markov decision processes, we provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning, which exhibit a linear speedup with respect to the number of agents and near-optimal dependencies on other salient problem parameters. In the asynchronous setting, existing analyses of federated Q-learning, which adopt an equally weighted averaging of local Q-estimates, require that every agent cover the entire state-action space. In contrast, our improved sample complexity scales inversely with the minimum entry of the average stationary state-action occupancy distribution of all agents, thus only requiring the agents to collectively cover the entire state-action space, unveiling the blessing of heterogeneity. However, the sample complexity of equally weighted averaging still suffers when the local trajectories are highly heterogeneous. In response, we propose a novel federated Q-learning algorithm with importance averaging, giving larger weights to more frequently visited state-action pairs, which achieves a robust linear speedup as if all trajectories were centrally processed, regardless of the heterogeneity of local behavior policies.
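The contrast between equally weighted averaging and importance averaging can be illustrated with a minimal sketch of a single aggregation step. This is not the paper's algorithm: the function names are hypothetical, and weighting each agent in proportion to its raw visit count is an assumption made here only to illustrate "larger weights to more frequently visited state-action pairs". The paper's exact weighting rule is not given in the abstract.

```python
from typing import List


def average_equal(local_q: List[List[float]]) -> List[float]:
    """Equally weighted average of local Q-tables, each flattened over (s, a)."""
    num_agents = len(local_q)
    num_entries = len(local_q[0])
    return [sum(q[i] for q in local_q) / num_agents for i in range(num_entries)]


def average_by_visits(
    local_q: List[List[float]],
    visits: List[List[int]],
    fallback: List[float],
) -> List[float]:
    """Visit-weighted average per (s, a) entry.

    Assumption (illustrative only): an agent's weight on an entry is
    proportional to how often it visited that entry since the last
    aggregation. Entries that no agent visited keep the `fallback` value.
    """
    merged = []
    for i in range(len(fallback)):
        total = sum(v[i] for v in visits)
        if total == 0:
            merged.append(fallback[i])
        else:
            weighted_sum = sum(v[i] * q[i] for q, v in zip(local_q, visits))
            merged.append(weighted_sum / total)
    return merged


# Two agents, three (s, a) entries. Agent 0 never visits entry 2 and
# agent 1 never visits entry 0, so only together do they cover all entries.
previous_q = [0.0, 0.0, 0.0]
local_q = [[1.0, 2.0, 0.0], [0.0, 4.0, 3.0]]
visits = [[4, 1, 0], [0, 3, 2]]

equal = average_equal(local_q)
weighted = average_by_visits(local_q, visits, previous_q)
print(equal)
print(weighted)
```

In this toy example, equal averaging pulls entries 0 and 2 toward the stale estimate of the agent that never visited them, whereas the visit-weighted rule takes each of those entries entirely from the agent that actually observed it.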

Topics

Artificial Intelligence > Learning Paradigms > Federated Learning

Reinforcement Learning > Methods > Multi-Agent Systems

Keywords

sample complexity, linear speedup, federated Q-learning, tabular Markov decision process, importance averaging
