2005
JMLR
A Generalization Error for Q-Learning
Abstract
Planning problems that involve learning a policy from a single training set of finite horizon trajectories arise in both social science and medical fields. We consider Q-learning with function approximation for this setting and derive an upper bound on the generalization error. This upper bound is in terms of quantities minimized by a Q-learning algorithm, the complexity of the approximation space and an approximation term due to the mismatch between Q-learning and the goal of learning a policy that maximizes the value function. © JMLR 2005.
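For readers unfamiliar with the setting, the following is a minimal sketch of batch Q-learning with linear function approximation on a fixed set of finite-horizon trajectories. It is an illustration under stated assumptions, not the paper's own procedure: the scalar observation, the two-action set, the per-action linear model, and all function names (`fit_line`, `batch_q_learning`, `greedy_action`) are hypothetical choices made here. The Q-function at each step is fit by least squares, working backward from the final step, and the learned policy acts greedily with respect to the fitted Q-functions.

```python
from statistics import mean

ACTIONS = (0, 1)  # assumed two-action problem, for illustration only


def fit_line(xs, ys):
    """Ordinary least squares for y ~ b0 + b1 * x (slope 0 if x is constant)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx if sxx > 0 else 0.0
    return (my - b1 * mx, b1)


def q_value(weights, obs, action):
    """Evaluate the per-action linear model Q(obs, action) = b0 + b1 * obs."""
    b0, b1 = weights[action]
    return b0 + b1 * obs


def batch_q_learning(trajectories, horizon):
    """Fit one Q-function per time step, from the last step back to the first.

    Each trajectory is a list of (observation, action, reward) tuples of
    length `horizon`. Every action must appear at every step in the data.
    """
    q = [None] * horizon
    for t in reversed(range(horizon)):
        data = {a: ([], []) for a in ACTIONS}
        for traj in trajectories:
            obs, action, reward = traj[t]
            target = reward
            if t + 1 < horizon:
                # Regression target: reward plus best fitted next-step value.
                next_obs = traj[t + 1][0]
                target += max(q_value(q[t + 1], next_obs, a) for a in ACTIONS)
            data[action][0].append(obs)
            data[action][1].append(target)
        q[t] = {a: fit_line(xs, ys) for a, (xs, ys) in data.items()}
    return q


def greedy_action(q, t, obs):
    """Policy induced by the fitted Q-functions: pick the maximizing action."""
    return max(ACTIONS, key=lambda a: q_value(q[t], obs, a))


# Toy training set (made up): reward is +obs under action 1, -obs under action 0.
TRAJECTORIES = [
    [(1.0, 1, 1.0), (2.0, 1, 2.0)],
    [(1.0, 0, -1.0), (-1.0, 0, 1.0)],
    [(-2.0, 1, -2.0), (-2.0, 1, -2.0)],
    [(-1.0, 0, 1.0), (1.0, 0, -1.0)],
    [(2.0, 0, -2.0), (1.0, 1, 1.0)],
    [(-1.0, 1, -1.0), (-1.0, 0, 1.0)],
]

q_functions = batch_q_learning(TRAJECTORIES, horizon=2)
```

The squared regression error at each step is the kind of quantity such an algorithm minimizes; it is distinct from the value of the resulting greedy policy, which is the mismatch the abstract refers to.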