A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces

Omar Darwiche Domingues; Pierre Menard; Matteo Pirotta; Emilie Kaufmann; Michal Valko

2021 AISTATS AISTATS 2021

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces

Abstract

In this work, we propose KeRNS: an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes (MDPs) whose state-action set is endowed with a metric. Using a non-parametric model of the MDP built with time-dependent kernels, we prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time, which quantifies its level of non-stationarity. Our method generalizes previous approaches based on sliding windows and exponential discounting used to handle changing environments. We further propose a practical implementation of KeRNS, we analyze its regret and validate it experimentally.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization and Reinforcement Learning

🧭 Keyword Pioneer — non-stationary reinforcement learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Omar Darwiche Domingues , Pierre Menard , Matteo Pirotta , Emilie Kaufmann , Michal Valko

Topics

Machine Learning > Optimization & Theory > Stochastic Processes Reinforcement Learning > Methods > Deep RL Reinforcement Learning > Methods > Policy Learning Machine Learning > Learning Types > Reinforcement Learning Mathematics & Optimization > Optimization > Optimization

Keywords

reinforcement learning markov decision process regret bound non-parametric model metric space non-stationary mdp kernel methods non-stationary reinforcement learning

Download PDF

Related papers

Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions 2021

Semi-Supervised Learning with Meta-Gradient 2021

Accelerating Metropolis-Hastings with Lightweight Inference Compilation 2021

When MAML Can Adapt Fast and How to Assist When It Cannot 2021

On the convergence of the Metropolis algorithm with fixed-order updates for multivariate binary probability distributions 2021