Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data

Dachao Lin; Ruoyu Sun; Zhihua Zhang

2021 NIPS NeurIPS 2021

Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data

Abstract

In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss. In particular, we show global directional convergence guarantees from a polynomial rate to a linear rate for (deep) linear networks with spherically symmetric data distribution, which can be viewed as a specific zero-margin dataset. Our results do not require the assumptions in other works such as small initial loss, presumed convergence of weight direction, or overparameterization. We also characterize our findings in experiments.

🧭 Keyword Pioneer — directional convergence

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Dachao Lin , Ruoyu Sun , Zhihua Zhang

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Machine Learning > Optimization & Theory > Theory

Keywords

convergence analysis gradient descent linear neural network directional convergence

Download PDF

Related papers

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation 2021

Test-Time Personalization with a Transformer for Human Pose Estimation 2021

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations 2021

Scalable Intervention Target Estimation in Linear Models 2021