Towards Provable Learning of Polynomial Neural Networks Using Low-Rank Matrix Estimation

Mohammadreza Soltani; Chinmay Hegde

AISTATS 2018

Abstract

We study the problem of (provably) learning the weights of a two-layer neural network with quadratic activations. In particular, we focus on the under-parametrized regime where the number of neurons in the hidden layer is (much) smaller than the dimension of the input. Our approach uses a lifting trick, which enables us to borrow algorithmic ideas from low-rank matrix estimation. In this context, we propose two novel, non-convex training algorithms which do not need any extra tuning parameters other than the number of hidden neurons. We support our algorithms with rigorous theoretical analysis, and show that the proposed algorithms enjoy linear convergence, fast running time per iteration, and near-optimal sample complexity. Finally, we complement our theoretical results with several numerical experiments.
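The lifting trick mentioned above can be illustrated with a minimal NumPy sketch. The abstract does not spell out the model, so the form used here is an assumption: the network output is taken to be sum_j alpha_j (w_j^T x)^2 with r hidden neurons, and the names `W`, `alpha`, `network_output`, and `lifted_output` are hypothetical. Under that assumption, the output is a linear function of the matrix L = sum_j alpha_j w_j w_j^T, whose rank is at most r, which is what connects weight learning to low-rank matrix estimation. This is not the authors' training algorithm, only a check of the identity.

```python
import numpy as np

def network_output(X, W, alpha):
    """Assumed model: sum_j alpha_j * (w_j^T x)^2 for each row x of X."""
    return ((X @ W) ** 2) @ alpha

def lifted_output(X, W, alpha):
    """Same output written as <x x^T, L> with L = sum_j alpha_j w_j w_j^T."""
    L = (W * alpha) @ W.T  # scale column j of W by alpha_j, then form the sum of outer products
    return np.einsum("ni,ij,nj->n", X, L, X), L

rng = np.random.default_rng(0)
p, r, m = 20, 3, 50  # input dimension, hidden neurons (r much smaller than p), samples
W = rng.standard_normal((p, r))      # hypothetical first-layer weights
alpha = rng.standard_normal(r)       # hypothetical second-layer weights
X = rng.standard_normal((m, p))      # random inputs

y_net = network_output(X, W, alpha)
y_lift, L = lifted_output(X, W, alpha)
rank_L = np.linalg.matrix_rank(L)    # at most r by construction
```

In this sketch the two output vectors agree up to floating-point error, and the p-by-p matrix L has rank no larger than the number of hidden neurons, so the under-parametrized regime (r much smaller than p) corresponds to a low-rank estimation problem.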

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Theory
Deep Learning > Architectures > Neural Networks
Machine Learning > Core Methods > Optimization
Deep Learning > Optimization & Theory > Theory

Keywords

non-convex optimization; sample complexity; linear convergence; quadratic activation; low-rank matrix estimation; two-layer neural network; polynomial neural network; non-convex training
