Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration

Priyank Agrawal; Jinglin Chen; Nan Jiang

2021 AAAI AAAI 2021

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration

Abstract

Abstract This paper studies regret minimization with randomized value functions in reinforcement learning. In tabular finite-horizon Markov Decision Processes, we introduce a clipping variant of one classical Thompson Sampling (TS)-like algorithm, randomized least-squares value iteration (RLSVI). Our $\tilde{\mathrm{O}}(H^2S\sqrt{AT})$ high-probability worst-case regret bound improves the previous sharpest worst-case regret bounds for RLSVI and matches the existing state-of-the-art worst-case TS-based regret bounds.

🌉 Interdisciplinary Bridge — Machine Learning and Reinforcement Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Priyank Agrawal , Jinglin Chen , Nan Jiang

Topics

Machine Learning > Optimization & Theory > Learning Theory Reinforcement Learning > Methods > Deep RL Machine Learning > Learning Types > Reinforcement Learning Machine Learning > Optimization & Theory > Stochastic Methods Machine Learning > Learning Types > Exploration

Keywords

reinforcement learning markov decision process value iteration thompson sampling regret bound worst-case regret randomized value function randomized least-squares value iteration tabular rl

Download PDF

Related papers

Contextual Conditional Reasoning 2021

Attention Beam: An Image Captioning Approach (Student Abstract) 2021

Movie Summarization via Sparse Graph Construction 2021

Text Analysis for Understanding Symptoms of Social Anxiety in Student Veterans 2021

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations through Scene Graphs 2021