2017 COLT COLT 2017

Nearly-tight VC-dimension bounds for piecewise linear neural networks

Abstract

We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$, and provide examples with VC-dimension $Ω( W L \log(W/L) )$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $Θ(W U)$ on the VC-dimension. All of these results generalize to arbitrary piecewise linear activation functions.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio