Finite-time theory for momentum Q-learning

Weng Bowen; Xiong Huaqing; Zhao Lin; Liang Yingbin; Zhang Wei

UAI 2021

Abstract

Existing studies indicate that momentum ideas from conventional optimization can improve the performance of Q-learning algorithms. However, finite-time analysis of momentum-based Q-learning algorithms is available only for the tabular case without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-time convergence guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov’s and Polyak’s momentum schemes and generalizes existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish a convergence guarantee for MomentumQ with linear function approximation under Markovian sampling. In particular, we characterize a finite-time convergence rate that is provably faster than that of vanilla Q-learning. This is the first finite-time analysis of momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-time convergence rate that is slightly better than that of SpeedyQ (Azar et al., NIPS 2011). Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.
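To make the tabular, synchronous-sampling setting concrete, here is a minimal sketch of Q-learning with a single heavy-ball (Polyak-style) momentum term. It is not the MomentumQ update, which the abstract does not spell out and which also involves a Nesterov-type term. The toy MDP, the function names (`make_random_mdp`, `optimal_q`, `momentum_q_sync`), the step-size schedule, and the momentum weight `beta` are all assumptions chosen for illustration.

```python
import numpy as np


def make_random_mdp(n_states=5, n_actions=2, seed=0):
    """Hypothetical toy MDP: kernel P[s, a, s'] and rewards R[s, a] in [0, 1)."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)  # normalize each row into a distribution
    R = rng.random((n_states, n_actions))
    return P, R


def optimal_q(P, R, gamma, iters=2000):
    """Reference Q* computed by exact value iteration on the known model."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        Q = R + gamma * (P @ Q.max(axis=1))
    return Q


def momentum_q_sync(P, R, gamma=0.9, steps=5000, beta=0.2, seed=1):
    """Synchronous tabular Q-learning plus a heavy-ball momentum term.

    Illustrative only: the schedule for alpha and the constant beta are
    assumptions, not the choices analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    cdf = P.cumsum(axis=2)
    Q = np.zeros((n_states, n_actions))
    Q_prev = Q.copy()
    for t in range(1, steps + 1):
        alpha = 1.0 / (1.0 + (1.0 - gamma) * t)  # assumed decaying step size
        # Synchronous sampling: draw one next state for every (s, a) pair.
        u = rng.random((n_states, n_actions, 1))
        next_states = np.minimum((u > cdf).sum(axis=2), n_states - 1)
        target = R + gamma * Q.max(axis=1)[next_states]  # sampled Bellman target
        # Q-learning step plus momentum on the previous change in Q.
        Q_next = Q + alpha * (target - Q) + beta * (Q - Q_prev)
        Q_prev, Q = Q, Q_next
    return Q


if __name__ == "__main__":
    P, R = make_random_mdp()
    Q_star = optimal_q(P, R, gamma=0.9)
    Q_hat = momentum_q_sync(P, R, gamma=0.9)
    print("max-norm error:", np.abs(Q_hat - Q_star).max())
```

Setting `beta=0.0` recovers plain synchronous Q-learning, so the same script can be used to compare the two on the toy problem; any difference seen there says nothing about the rates proved in the paper.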

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization
Reinforcement Learning > Methods > Deep RL

Keywords

reinforcement learning, function approximation, finite-time analysis, Nesterov momentum, momentum Q-learning
