Learning a Single Neuron with Gradient Methods

Gilad Yehudai, Ohad Shamir

COLT 2020

Abstract

We consider the fundamental problem of learning a single neuron $\mathbf{x}\mapsto \sigma(\mathbf{w}^\top\mathbf{x})$ in a realizable setting, using standard gradient methods with random initialization, and under general families of input distributions and activations. On the one hand, we show that some assumptions on both the distribution and the activation function are necessary. On the other hand, we prove positive guarantees under mild assumptions, which go significantly beyond those studied in the literature so far. We also point out and study the challenges in further strengthening and generalizing our results.
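
The setting in the abstract can be illustrated with a small sketch. It is not the authors' experiment or analysis: it assumes a ReLU activation, standard Gaussian inputs, a squared loss, and full-batch gradient descent, and the function name `learn_single_neuron` along with the dimension, sample count, step size, and seed are hypothetical choices made only for illustration. Labels are generated by a target neuron $\mathbf{v}$, which is what makes the problem realizable.

```python
import math
import random


def relu(z):
    return max(z, 0.0)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def learn_single_neuron(dim=5, n_samples=500, steps=300, lr=0.1, seed=0):
    """Gradient descent on (1/2n) * sum_i (relu(w.x_i) - relu(v.x_i))^2."""
    rng = random.Random(seed)

    # Hypothetical target neuron v, normalized to unit length.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm_v = math.sqrt(dot(v, v))
    v = [vi / norm_v for vi in v]

    # Assumption: inputs are standard Gaussian. Labels come from the
    # target neuron itself, so the setting is realizable.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_samples)]
    ys = [relu(dot(v, x)) for x in xs]

    # Small random initialization of the learned weights.
    w = [rng.gauss(0.0, 0.1) for _ in range(dim)]

    history = [distance(w, v)]
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in zip(xs, ys):
            pre = dot(w, x)
            if pre > 0.0:  # ReLU derivative is 1 here and taken as 0 otherwise
                residual = pre - y
                for j in range(dim):
                    grad[j] += residual * x[j] / n_samples
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
        history.append(distance(w, v))
    return w, v, history


w, v, history = learn_single_neuron()
print(f"initial distance to target: {history[0]:.4f}")
print(f"final distance to target:   {history[-1]:.4f}")
```

The returned `history` records $\|\mathbf{w}_t - \mathbf{v}\|$ after every step, so it can be inspected or plotted to see how the iterate approaches the target under these particular assumptions. Other activations or input distributions can behave very differently, which is the question the paper studies.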

Topics

Machine Learning > Core Methods > Representation Learning

Machine Learning > Optimization & Theory > Optimization

Keywords

gradient descent, activation function, single neuron, random initialization, realizable setting

Related papers

Open Problem: Average-Case Hardness of Hypergraphic Planted Clique Detection 2020

Highly smooth minimization of non-smooth problems 2020

Closure Properties for Private Classification and Online Prediction 2020

Efficient, Noise-Tolerant, and Private Learning via Boosting 2020

Domain Compression and its Application to Randomness-Optimal Distributed Goodness-of-Fit 2020