Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

Tianyi Liu; Yan Li; Enlu Zhou; Tuo Zhao

AISTATS 2022

Abstract

We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in \mathbb{R}^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model: we parameterize $Y^*$ by $XX^\top$, where $X\in \mathbb{R}^{d\times d}$. We then show that, under mild conditions, the estimator obtained by randomly perturbed gradient descent with the square loss attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation attains a mean square error of only $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models and provides new understanding of training over-parameterized neural networks.
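The setup in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' exact algorithm: the loss scaling $\frac{1}{4}\|XX^\top - Y\|_F^2$, the symmetric Gaussian observation noise, the Gaussian perturbation of the iterate before each gradient evaluation, and all names and hyperparameters (`make_problem`, `recover`, `perturb_std`, `init_scale`, the step size) are assumptions chosen for illustration.

```python
import numpy as np

def make_problem(d, sigma, rng):
    """Rank one ground truth Y* = x x^T plus symmetric Gaussian noise (assumed noise model)."""
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    Y_star = np.outer(x, x)
    A = sigma * rng.standard_normal((d, d))
    noise = (A + A.T) / np.sqrt(2.0)  # symmetrized; off-diagonal variance sigma^2
    return Y_star, Y_star + noise

def grad(X, Y):
    """Gradient of (1/4) * ||X X^T - Y||_F^2 with respect to X, for symmetric Y."""
    return (X @ X.T - Y) @ X

def recover(Y, steps, lr, rng, perturb_std=0.0, init_scale=1e-2):
    """Gradient descent on the over-parameterized factor X (d x d).

    With perturb_std > 0 the gradient is evaluated at a randomly perturbed
    iterate (an assumed form of perturbed gradient descent); with
    perturb_std = 0 this is plain gradient descent.
    """
    d = Y.shape[0]
    X = init_scale * rng.standard_normal((d, d))
    for _ in range(steps):
        X_eval = X + perturb_std * rng.standard_normal((d, d))
        X = X - lr * grad(X_eval, Y)
    return X @ X.T

def mse(Y_hat, Y_star):
    """Mean squared entrywise error (assumed error metric)."""
    return float(np.mean((Y_hat - Y_star) ** 2))

# Small demonstration with hypothetical settings.
rng = np.random.default_rng(0)
Y_star, Y = make_problem(d=20, sigma=0.05, rng=rng)
plain = recover(Y, steps=500, lr=0.1, rng=rng)
perturbed = recover(Y, steps=500, lr=0.1, rng=rng, perturb_std=0.1)
print("plain GD MSE:    ", mse(plain, Y_star))
print("perturbed GD MSE:", mse(perturbed, Y_star))
```

The printed errors depend on the seed, the step size, and the perturbation scale, so this sketch should be read as a way to experiment with the two estimators rather than as a reproduction of the stated rates.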

Authors

Tianyi Liu, Yan Li, Enlu Zhou, Tuo Zhao

Topics

Machine Learning > Learning Types > Weakly Supervised Learning

Machine Learning > Optimization & Theory > Learning Theory

Machine Learning > Optimization & Theory > Optimization

Machine Learning > Core Methods > Optimization

Deep Learning > Optimization & Theory > Optimization

Keywords

gradient descent, matrix recovery, implicit regularization, noise regularization, over-parameterized neural network, neural network
