Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Spencer Frei; Yuan Cao; Quanquan Gu

2021 ICML ICML 2021

Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Abstract

We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of halfspaces. We show that when a quantity we refer to as the \textit{soft margin} is well-behaved—a condition satisfied by log-concave isotropic distributions among others—minimizers of convex surrogates for the zero-one loss are approximate minimizers for the zero-one loss itself. As standard convex optimization arguments lead to efficient guarantees for minimizing convex surrogates of the zero-one loss, our methods allow for the first positive guarantees for the classification error of halfspaces learned by gradient descent using the binary cross-entropy or hinge loss in the presence of agnostic label noise.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Spencer Frei , Yuan Cao , Quanquan Gu

Topics

Machine Learning > Core Methods > Classification Machine Learning > Optimization & Theory > Learning Theory Machine Learning > Optimization & Theory > Optimization Machine Learning > Learning Types > Supervised Learning Machine Learning > Learning Types > Classification

Keywords

label noise gradient descent agnostic learning soft margin halfspace learning classification error convex surrogate

Download PDF

Related papers

GRAND: Graph Neural Diffusion 2021

Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits 2021

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution 2021

Dataset Dynamics via Gradient Flows in Probability Space 2021