Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

Zebang Shen; Hui Qian; Tongzhou Mu; Chao Zhang

IJCAI 2017

Abstract

Algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique for solving large-scale empirical risk minimization problems in learning tasks. While enjoying a provably superior convergence rate, the algorithm accesses only a mini-batch of samples and updates only a small block of variable coordinates in each iteration, which substantially reduces the number of memory references when the sample size is massive and the dimensionality is ultra-high. Specifically, to obtain an ε-accurate solution, our algorithm requires only O(log(1/ε)/sqrt(ε)) overall computation in the general convex case and O((n + sqrt(nκ)) log(1/ε)) in the strongly convex case. Empirical studies on huge-scale datasets illustrate the efficiency of our method in practice.
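To make the "doubly stochastic" idea concrete, the following is a minimal sketch of a plain (non-accelerated) doubly stochastic gradient step: each iteration samples a mini-batch of rows and a block of coordinates, and updates only that block. It is not the accelerated multi-momentum method proposed in the paper. The least-squares loss, the synthetic data, the step size, and all function and parameter names are assumptions made for illustration.

```python
import random


def doubly_stochastic_step(X, y, w, batch_size, block_size, step, rng):
    """One plain doubly stochastic gradient step for least squares (updates w in place).

    Hypothetical helper for illustration; no momentum or acceleration is used.
    """
    n, d = len(X), len(w)
    batch = rng.sample(range(n), batch_size)  # random mini-batch of samples
    block = rng.sample(range(d), block_size)  # random block of coordinates
    # Residuals x_i . w - y_i, computed for the sampled rows only.
    residuals = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in batch]
    # Mini-batch gradient restricted to the sampled coordinates.
    for j in block:
        g_j = sum(r * X[i][j] for r, i in zip(residuals, batch)) / batch_size
        w[j] -= step * g_j
    return w


def objective(X, y, w):
    """Empirical risk (1 / 2n) * sum_i (x_i . w - y_i)^2."""
    n = len(X)
    total = 0.0
    for x_i, y_i in zip(X, y):
        pred = sum(a * b for a, b in zip(x_i, w))
        total += (pred - y_i) ** 2
    return total / (2 * n)


def demo(seed=0, n=50, d=10, iters=2000):
    """Run the sketch on noiseless synthetic data (an assumed toy setup)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    w_true = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [sum(a * b for a, b in zip(x_i, w_true)) for x_i in X]
    w = [0.0] * d
    initial = objective(X, y, w)
    for _ in range(iters):
        doubly_stochastic_step(X, y, w, batch_size=5, block_size=3, step=0.05, rng=rng)
    return initial, objective(X, y, w)


initial, final = demo()
print("objective before:", initial, "after:", final)
```

Each step here reads only `batch_size` rows and writes only `block_size` coordinates. The sketch still recomputes full inner products for the sampled rows; a practical implementation would avoid that cost, which this toy version does not attempt.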
