Budgeted Bandit Problems with Continuous Random Costs

Yingce Xia; Wenkui Ding; Xu-dong Zhang; Nenghai Yu; Tao Qin

ACML 2015

Abstract

We study the budgeted bandit problem, in which each arm is associated with both a reward and a cost. The objective is to design an arm-pulling algorithm that maximizes the total reward collected before the budget runs out. We consider both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper-confidence-bound-based algorithm for multi-armed bandits and a confidence-ball-based algorithm for linear bandits, and prove logarithmic regret bounds for both. Simulations verify the effectiveness of the proposed algorithms.
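To make the setting concrete, here is a minimal Python sketch of a budgeted multi-armed bandit with continuous random costs. It is an illustration only and not the algorithm proposed in the paper, whose exact index is not stated on this page. The selection rule (an optimistic reward estimate divided by a pessimistic cost estimate), the helper names `make_arm` and `budgeted_ucb`, the Bernoulli rewards, and the uniform costs are all assumptions chosen for the example.

```python
import math
import random


def make_arm(mean_reward, cost_low, cost_high):
    """Hypothetical arm: Bernoulli reward, cost uniform on [cost_low, cost_high]."""
    def pull(rng):
        reward = 1.0 if rng.random() < mean_reward else 0.0
        cost = rng.uniform(cost_low, cost_high)  # continuous random cost
        return reward, cost
    return pull


def budgeted_ucb(arms, budget, seed=0, eps=1e-6):
    """Pull arms until the next pull is unaffordable.

    Returns (total_reward, pulls_per_arm, budget_spent).
    Assumes rewards lie in [0, 1] and costs lie in (0, 1].
    """
    rng = random.Random(seed)
    k = len(arms)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    total_reward = 0.0
    spent = 0.0
    t = 0
    while True:
        t += 1
        if t <= k:
            i = t - 1  # pull every arm once to initialize the estimates
        else:
            def index(j):
                # Illustrative ratio index (an assumption, not the paper's):
                # upper bound on mean reward over lower bound on mean cost.
                bonus = math.sqrt(2.0 * math.log(t) / pulls[j])
                reward_ucb = min(reward_sum[j] / pulls[j] + bonus, 1.0)
                cost_lcb = max(cost_sum[j] / pulls[j] - bonus, eps)
                return reward_ucb / cost_lcb
            i = max(range(k), key=index)
        reward, cost = arms[i](rng)
        if spent + cost > budget:
            break  # the realized cost exceeds the remaining budget
        spent += cost
        total_reward += reward
        pulls[i] += 1
        reward_sum[i] += reward
        cost_sum[i] += cost
    return total_reward, pulls, spent


arms = [make_arm(0.3, 0.1, 1.0), make_arm(0.6, 0.5, 1.0), make_arm(0.5, 0.1, 0.4)]
total_reward, pulls, spent = budgeted_ucb(arms, budget=50.0, seed=1)
print(total_reward, pulls, round(spent, 3))
```

The number of pulls is itself random here, because it depends on the realized costs. That is the feature separating this setting from a fixed-horizon bandit.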

Topics

Mathematics & Optimization > Optimization > Online Algorithms
Machine Learning > Learning Types > Online Learning
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

online learning, multi-armed bandit, regret bound, linear bandit, budget constraint

Related papers

Continuous Target Shift Adaptation in Supervised Learning 2015

Surrogate regret bounds for generalized classification performance metrics 2015

Statistical Unfolded Logic Learning 2015

Integration of Single-view Graphs with Diffusion of Tensor Product Graphs for Multi-view Spectral Clustering 2015

Class-prior Estimation for Learning from Positive and Unlabeled Data 2015