Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods

Andre Wibisono; Martin J. Wainwright; Michael I. Jordan; John C. Duchi

NeurIPS (NIPS) 2012

Abstract

We consider derivative-free algorithms for stochastic optimization problems that use only noisy function values rather than gradients, analyzing their finite-sample convergence rates. We show that if pairs of function values are available, algorithms that use gradient estimates based on random perturbations suffer a factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic gradient methods, where $d$ is the dimension of the problem. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, which show that our bounds are sharp with respect to all problem-dependent quantities: they cannot be improved by more than constant factors.
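To make the idea of a gradient estimate built from a pair of function values concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's exact procedure: the Gaussian perturbation, the forward-difference form, the step-size schedule, and the toy objective `noisy_quadratic` are all choices made here, and every function name is hypothetical. Both evaluations in a pair share the same random sample, which is what "pairs of function values" is taken to mean in this sketch.

```python
import math
import random


def two_point_gradient(F, x, sample, u, rng):
    """Estimate a gradient of F(.; sample) at x from two function values.

    Assumed form: g = (F(x + u*z; sample) - F(x; sample)) / u * z,
    with z drawn from a standard Gaussian (an illustrative choice).
    """
    z = [rng.gauss(0.0, 1.0) for _ in x]
    x_perturbed = [xi + u * zi for xi, zi in zip(x, z)]
    slope = (F(x_perturbed, sample) - F(x, sample)) / u
    return [slope * zi for zi in z]


def zero_order_sgd(F, draw_sample, x0, steps, step_size, u, seed=0):
    """Run stochastic gradient descent using only function evaluations."""
    rng = random.Random(seed)
    x = list(x0)
    for t in range(1, steps + 1):
        sample = draw_sample(rng)            # one sample shared by both evaluations
        g = two_point_gradient(F, x, sample, u, rng)
        eta = step_size / math.sqrt(t)       # decaying step size (assumed schedule)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


# Toy problem (hypothetical): F(x; s) = 0.5 * ||x - s||^2 with s ~ N(1, 0.1^2)
# per coordinate, so the expected objective is minimized at the all-ones vector.
DIM = 5


def noisy_quadratic(x, sample):
    return 0.5 * sum((xi - si) ** 2 for xi, si in zip(x, sample))


def draw_sample(rng):
    return [rng.gauss(1.0, 0.1) for _ in range(DIM)]


x_hat = zero_order_sgd(noisy_quadratic, draw_sample, [0.0] * DIM,
                       steps=5000, step_size=0.05, u=1e-4)
distance = math.sqrt(sum((xi - 1.0) ** 2 for xi in x_hat))
```

The estimator multiplies a directional finite difference by the random direction itself, so its second moment grows with the dimension. That growth is the intuition behind a dimension-dependent slowdown relative to methods that observe gradients directly.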

Topics

Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Stochastic Processes
Machine Learning > Optimization & Theory > Theory
Machine Learning > Optimization & Theory > Stochastic Methods
Mathematics & Optimization > Optimization > Stochastic Methods
Mathematics & Optimization > Optimization > Optimization

Keywords

stochastic optimization, gradient estimation, zero-order optimization, finite-sample analysis, derivative-free methods, derivative-free optimization, finite-sample convergence, noisy function values, convergence rate, zeroth-order optimization, finite sample

Related papers

Kernel Hyperalignment 2012

Fused sparsity and robust estimation for linear models with unknown variance 2012

Slice sampling normalized kernel-weighted completely random measure mixture models 2012

Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization 2012

Matrix reconstruction with the local max norm 2012