Local Regularizer Improves Generalization

Yikai Zhang; Hui Qu; Dimitris Metaxas; Chao Chen

AAAI 2020

Abstract

Regularization plays an important role in the generalization of deep learning. In this paper, we study the generalization power of an unbiased regularizer for training algorithms in deep learning. We focus on a family of training methods called Locally Regularized Stochastic Gradient Descent (LRSGD). LRSGD applies a proximal-type penalty in its gradient descent steps to regularize SGD during training. We show that, with carefully chosen parameters, LRSGD generalizes better than SGD. Our theoretical analysis is supported by experimental evidence. It advances the theoretical understanding of deep learning and provides new perspectives on designing training algorithms. The code is available at https://github.com/huiqu18/LRSGD.
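The abstract does not spell out the update rule, so the following is only a minimal sketch of what a proximal-type penalty inside an SGD step could look like, not the authors' implementation. It assumes the penalty takes the form (lam/2)·(w − anchor)², where `anchor` is a reference point that is periodically reset to the current iterate. The function name `lrsgd_least_squares` and the parameters `lam` and `anchor_every` are hypothetical, chosen for illustration on a one-dimensional, noise-free least-squares problem.

```python
import random


def lrsgd_least_squares(data, lr=0.05, lam=0.1, anchor_every=10, epochs=50, seed=0):
    """SGD on 1-D least squares with an assumed proximal-type penalty.

    Each step descends on 0.5*(w*x - y)^2 + (lam/2)*(w - anchor)^2.
    The penalty form and the anchor schedule are illustrative assumptions,
    not taken from the paper.
    """
    rng = random.Random(seed)
    w = 0.0
    anchor = w  # hypothetical local reference point
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x, y in samples:
            grad_loss = (w * x - y) * x        # gradient of the sample loss
            grad_penalty = lam * (w - anchor)  # gradient of the proximal penalty
            w -= lr * (grad_loss + grad_penalty)
            step += 1
            if step % anchor_every == 0:
                anchor = w  # re-center the penalty on the current iterate
    return w


# Noise-free data generated by y = 3 * x, so the minimizer is w = 3.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_hat = lrsgd_least_squares(data)
w_sgd = lrsgd_least_squares(data, lam=0.0)  # lam = 0 reduces to plain SGD
```

Setting `lam=0.0` recovers plain SGD, which makes it easy to compare the two on the same data. The repository linked in the abstract is the place to look for the actual method and its parameter choices.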

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Statistical Learning
Deep Learning > Techniques > Model Architecture
Machine Learning > Learning Types > Deep Learning
Deep Learning > Optimization & Theory > Neural Network Optimization
Deep Learning > Optimization & Theory > Optimization
Deep Learning > Optimization & Theory > Theory
Deep Learning > Optimization & Theory > Regularization

Keywords

stochastic gradient descent deep learning neural network optimization regularization theory local regularizer proximal penalty
