The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

Jian Wu; Peter Frazier

NIPS 2016 (NeurIPS 2016)

Abstract

In many applications of black-box optimization, one can evaluate multiple points simultaneously, for example when evaluating the performance of several different neural network architectures in a parallel computing environment. In this paper, we develop a novel batch Bayesian optimization algorithm: the parallel knowledge gradient method. By construction, this method provides the one-step Bayes-optimal batch of points to sample. We provide an efficient strategy for computing this Bayes-optimal batch, and we demonstrate that the parallel knowledge gradient method finds global optima significantly faster than previous batch Bayesian optimization algorithms, both on synthetic test functions and when tuning the hyperparameters of practical machine learning algorithms, especially when function evaluations are noisy.
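
To illustrate what a "one-step Bayes-optimal batch" can mean in practice, the sketch below estimates a batch knowledge-gradient value by plain Monte Carlo. It rests on several assumptions that the abstract does not state: the goal is minimization, the Gaussian-process posterior is already summarized by a mean vector `mu` and covariance matrix `K` on a finite candidate set, and observation noise is Gaussian with variance `noise_var`. The function name `qkg_monte_carlo` and all of its parameters are hypothetical, and this naive estimator is not the efficient computation strategy the paper develops.

```python
import numpy as np


def qkg_monte_carlo(mu, K, batch_idx, noise_var=0.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of a batch knowledge-gradient value (minimization).

    mu        : (m,) posterior mean on a finite candidate set (assumed discretization)
    K         : (m, m) posterior covariance on the same set
    batch_idx : indices of the q candidates that would be evaluated together
    """
    rng = np.random.default_rng(seed)
    idx = np.asarray(batch_idx)
    q = idx.size

    # Predictive covariance of the q (possibly noisy) observations.
    S = K[np.ix_(idx, idx)] + noise_var * np.eye(q)
    # A small jitter keeps the Cholesky factorization numerically stable.
    L = np.linalg.cholesky(S + 1e-10 * np.eye(q))

    # sigma_tilde = K[:, idx] @ inv(L).T, so the posterior mean after the batch
    # is distributed as mu + sigma_tilde @ Z with Z ~ N(0, I_q).
    sigma_tilde = np.linalg.solve(L, K[idx, :]).T  # shape (m, q)

    Z = rng.standard_normal((n_samples, q))
    updated_means = mu[None, :] + Z @ sigma_tilde.T  # shape (n_samples, m)

    # Expected improvement in the best posterior mean after observing the batch.
    return mu.min() - updated_means.min(axis=1).mean()
```

Under these assumptions, a batch would be chosen by maximizing this value over candidate batches. As a sanity check, for `mu = 0`, `K = I`, and a single noiseless point, the exact value is 1/sqrt(2π) ≈ 0.399, which the estimate should approach as `n_samples` grows.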

Topics

Artificial Intelligence > Learning Paradigms > Meta-Learning
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Learning Types > Bayesian Optimization
Machine Learning > Learning Types > Hyperparameter Optimization

Keywords

black-box optimization, parallel optimization, Bayesian optimization, hyperparameter tuning, batch optimization, knowledge gradient
