2018
COLT
COLT 2018
Adaptivity to Smoothness in X-armed bandits
Abstract
We study the stochastic continuum-armed bandit problem from the angle of adaptivity to \emph{unknown regularity} of the reward function $f$. We prove that there exists no strategy for the cumulative regret that adapts optimally to the \emph{smoothness} of $f$. We show however that such minimax optimal adaptive strategies exist if the learner is given \emph{extra-information} about $f$. Finally, we complement our positive results with matching lower bounds.
🌉
Interdisciplinary Bridge
— Machine Learning and Mathematics & Optimization
🧭
Keyword Pioneer
— x-armed bandit
🐣
Hot Topic Early Bird
— minimax optimal
🐝
Cross-Pollinator
— Artificial Intelligence, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Reinforcement Learning, Security & Privacy