Balanced Linear Contextual Bandits

Maria Dimakopoulou; Zhengyuan Zhou; Susan Athey; Guido Imbens

AAAI 2019

Abstract

Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature in their estimation to make the estimation less prone to bias. We provide the first regret bound analyses for linear contextual bandits with balancing and show that our algorithms match the state-of-the-art theoretical guarantees. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data.
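To make the idea of balancing in estimation concrete, here is a minimal sketch of one common balancing approach from causal inference: a ridge regression for a single arm in which each observation is weighted by the inverse of the probability that the arm was chosen for that context. This is an illustration under stated assumptions, not the authors' exact algorithm. The function name `balanced_ridge`, the regularization default `lam`, and the clipping threshold `min_propensity` are hypothetical choices that do not come from the abstract.

```python
import numpy as np


def balanced_ridge(X, y, propensities, lam=1.0, min_propensity=0.05):
    """Inverse-propensity-weighted ridge regression for one arm (illustrative).

    X            : (n, d) contexts in which this arm was pulled
    y            : (n,) observed rewards
    propensities : (n,) probability with which the arm was chosen each time
    lam          : ridge penalty (assumed default, not from the paper)
    min_propensity : clipping floor that bounds the weights (assumption)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Clip small propensities so that no single observation dominates.
    p = np.clip(np.asarray(propensities, dtype=float), min_propensity, 1.0)
    w = 1.0 / p  # balancing weights

    # Weighted normal equations: (X^T W X + lam * I) theta = X^T W y
    A = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)


# Small usage example with two observations and two features.
X_demo = np.eye(2)
y_demo = np.array([1.0, 2.0])
theta_uniform = balanced_ridge(X_demo, y_demo, [1.0, 1.0])
theta_balanced = balanced_ridge(X_demo, y_demo, [0.5, 1.0])
```

In the usage example, the first observation was collected with a lower propensity, so it receives a larger weight and its coefficient is shrunk less by the ridge penalty than in the uniformly weighted fit. A bandit algorithm would then plug such an estimate into its exploration rule, for instance an upper confidence bound or posterior sampling step.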

Authors

Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens

Topics

Artificial Intelligence > Core AI > Agent Systems

Machine Learning > Core Methods > Regression

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Learning Types > Online Learning

Machine Learning > Learning Types > Exploration-Exploitation

Keywords

causal inference, model misspecification, regret bound, contextual bandit, linear bandit, heterogeneous treatment effect, linear payoff, balancing method
