Adaptive Monte Carlo via Bandit Allocation

James Neufeld; András György; Csaba Szepesvári; Dale Schuurmans

ICML 2014


Abstract

We consider the problem of sequentially choosing between a set of unbiased Monte Carlo estimators to minimize the mean squared error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well-developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.
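To make the reduction described in the abstract concrete, here is a minimal sketch, not the authors' algorithm. It assumes every estimator is unbiased for the same quantity and returns values in [-bound, bound], so the estimator with the smallest second moment also has the smallest variance. Each estimator is treated as a bandit arm with reward 1 - (x / bound)^2, arms are chosen with the standard UCB1 index, and the final estimate is the plain average of all samples drawn. The names `adaptive_mc`, `bernoulli_half`, and `narrow_uniform` are hypothetical and chosen only for illustration. The cost-aware extension mentioned in the abstract is not covered.

```python
import math
import random


def adaptive_mc(estimators, budget, bound=1.0, seed=0):
    """Allocate `budget` draws among unbiased estimators using UCB1.

    Assumptions (illustrative, not taken from the paper):
    - each estimator is a callable taking a random.Random and returning
      one sample in [-bound, bound];
    - all estimators share the same mean;
    - budget >= len(estimators).
    Returns (combined_estimate, per_estimator_draw_counts).
    """
    rng = random.Random(seed)
    k = len(estimators)
    counts = [0] * k          # draws taken from each estimator
    reward_sums = [0.0] * k   # running sum of rewards 1 - (x / bound)^2
    total = 0.0               # running sum of all samples

    for t in range(1, budget + 1):
        if t <= k:
            # Draw from every estimator once to initialise its statistics.
            arm = t - 1
        else:
            # UCB1 index: empirical mean reward plus exploration bonus.
            arm = max(
                range(k),
                key=lambda i: reward_sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        x = estimators[arm](rng)
        total += x
        counts[arm] += 1
        # A smaller squared sample means a larger reward in [0, 1].
        reward_sums[arm] += 1.0 - (x / bound) ** 2

    return total / budget, counts


def bernoulli_half(rng):
    """High-variance unbiased estimator of 0.5 (hypothetical example)."""
    return 1.0 if rng.random() < 0.5 else 0.0


def narrow_uniform(rng):
    """Low-variance unbiased estimator of 0.5 (hypothetical example)."""
    return rng.uniform(0.45, 0.55)


if __name__ == "__main__":
    estimate, counts = adaptive_mc([bernoulli_half, narrow_uniform], budget=2000)
    print(estimate, counts)
```

In this toy setup the allocation should concentrate on `narrow_uniform`, whose second moment is smaller, while still sampling `bernoulli_half` occasionally because of the exploration bonus.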



Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Mathematics & Optimization > Mathematics > Probability
Machine Learning > Optimization & Theory > Stochastic Methods
Machine Learning > Learning Types > Multi-Armed Bandits

Keywords

stochastic optimization, sampling strategy, adaptive sampling, monte carlo estimation, mean squared error, multi-armed bandit, monte carlo method, mean-squared error, estimator allocation

