Do Subsampled Newton Methods Work for High-Dimensional Data?

Xiang Li; Shusen Wang; Zhihua Zhang

2020 AAAI AAAI 2020

Do Subsampled Newton Methods Work for High-Dimensional Data?

Abstract

Abstract Subsampled Newton methods approximate Hessian matrices through subsampling techniques to alleviate the per-iteration cost. Previous results require Ω (d) samples to approximate Hessians, where d is the dimension of data points, making it less practical for high-dimensional data. The situation is deteriorated when d is comparably as large as the number of data points n, which requires to take the whole dataset into account, making subsampling not useful. This paper theoretically justifies the effectiveness of subsampled Newton methods on strongly convex empirical risk minimization with high dimensional data. Specifically, we provably require only Θ˜(deffγ) samples for approximating the Hessian matrices, where deffγ is the γ-ridge leverage and can be much smaller than d as long as nγ ≫ 1. Our theories work for three types of Newton methods: subsampled Netwon, distributed Newton, and proximal Newton.

❓ The Questioner

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization

🧭 Keyword Pioneer — subsampled newton method

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Xiang Li , Shusen Wang , Zhihua Zhang

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Mathematics & Optimization > Mathematics > Numerical Analysis Mathematics & Optimization > Optimization > Continuous Optimization

Keywords

convex optimization empirical risk minimization hessian approximation high-dimensional optimization subsampled newton method ridge leverage

Download PDF

Related papers

Enhancing Pointer Network for Sentence Ordering with Pairwise Ordering Predictions 2020

CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning 2020

Neural Simile Recognition with Cyclic Multitask Learning and Local Attention 2020

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy 2020

Multi-Point Semantic Representation for Intent Classification 2020