A NON-PARAMETRIC REGRESSION VIEWPOINT: GENERALIZATION OF OVERPARAMETRIZED DEEP RELU NETWORK UNDER NOISY OBSERVATIONS

Namjoon Suh; Hyunouk Ko; Xiaoming Huo

ICLR 2022

Abstract

We study the generalization properties of overparametrized deep neural networks (DNNs) with Rectified Linear Unit (ReLU) activations. Under the non-parametric regression framework, we assume that the ground-truth function belongs to a reproducing kernel Hilbert space (RKHS) induced by the neural tangent kernel (NTK) of a ReLU DNN, and that the observations are corrupted by noise. We prove that, without a careful use of early stopping, an overparametrized DNN trained by vanilla gradient descent does not recover the ground-truth function: the $L_{2}$ prediction error of the estimated DNN is bounded away from $0$. Complementing this result, we show that $\ell_{2}$-regularized gradient descent enables the overparametrized DNN to achieve the minimax optimal convergence rate of the $L_{2}$ prediction error without early stopping. Notably, the rate we obtain is faster than the $\mathcal{O}(n^{-1/2})$ rate known in the literature.
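
To make the setting concrete, the two training schemes contrasted above can be sketched in symbols. The notation here is an assumption introduced for illustration and does not appear in the abstract: $f^{\star}$ is the ground-truth function, $\mathcal{H}$ the NTK-induced RKHS, $f_{\theta}$ the DNN with parameters $\theta$, $\varepsilon_{i}$ the noise, and $\mu > 0$ a regularization strength. The exact form of the penalty (for example, whether it is centered at the initialization) is not stated in the abstract, so the generic $\ell_{2}$ penalty below is only a placeholder.

$$
\begin{aligned}
&\text{Data:} && y_{i} = f^{\star}(x_{i}) + \varepsilon_{i}, \quad i = 1, \dots, n, \qquad f^{\star} \in \mathcal{H}, \\
&\text{Vanilla objective:} && \mathcal{L}(\theta) = \frac{1}{2} \sum_{i=1}^{n} \bigl( y_{i} - f_{\theta}(x_{i}) \bigr)^{2}, \\
&\text{Regularized objective:} && \mathcal{L}_{\mu}(\theta) = \frac{1}{2} \sum_{i=1}^{n} \bigl( y_{i} - f_{\theta}(x_{i}) \bigr)^{2} + \frac{\mu}{2} \lVert \theta \rVert_{2}^{2}, \\
&\text{$L_{2}$ prediction error:} && \lVert \hat{f} - f^{\star} \rVert_{2}^{2} = \mathbb{E}_{x} \Bigl[ \bigl( \hat{f}(x) - f^{\star}(x) \bigr)^{2} \Bigr].
\end{aligned}
$$

Read this way, the negative result says that gradient descent on $\mathcal{L}$ run to convergence fits the noise $\varepsilon_{i}$, so the prediction error does not vanish as $n$ grows, while gradient descent on $\mathcal{L}_{\mu}$ attains the minimax optimal rate.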
