Never Go Full Batch (in Stochastic Convex Optimization)

Idan Amir; Yair Carmon; Tomer Koren; Roi Livni

2021 NIPS NeurIPS 2021

Never Go Full Batch (in Stochastic Convex Optimization)

Abstract

We study the generalization performance of $\text{\emph{full-batch}}$ optimization algorithms for stochastic convex optimization: these are first-order methods that only access the exact gradient of the empirical risk (rather than gradients with respect to individual data points), that include a wide range of algorithms such as gradient descent, mirror descent, and their regularized and/or accelerated variants. We provide a new separation result showing that, while algorithms such as stochastic gradient descent can generalize and optimize the population risk to within $\epsilon$ after $O(1/\epsilon^2)$ iterations, full-batch methods either need at least $\Omega(1/\epsilon^4)$ iterations or exhibit a dimension-dependent sample complexity.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — full-batch optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Idan Amir , Yair Carmon , Tomer Koren , Roi Livni

Topics

Machine Learning > Optimization & Theory > Learning Theory Machine Learning > Optimization & Theory > Optimization Machine Learning > Optimization & Theory > Theory Mathematics & Optimization > Optimization > Stochastic Methods

Keywords

sample complexity gradient descent generalization error stochastic convex optimization full-batch optimization

Download PDF

Related papers

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation 2021

Test-Time Personalization with a Transformer for Human Pose Estimation 2021

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations 2021

Scalable Intervention Target Estimation in Linear Models 2021