Explicitly Imposing Constraints in Deep Networks via Conditional Gradients Gives Improved Generalization and Faster Convergence

Sathya N. Ravi; Tuan Dinh; Vishnu Suresh Lokhande; Vikas Singh

2019 AAAI AAAI 2019

Explicitly Imposing Constraints in Deep Networks via Conditional Gradients Gives Improved Generalization and Faster Convergence

Abstract

Abstract A number of results have recently demonstrated the benefits of incorporating various constraints when training deep architectures in vision and machine learning. The advantages range from guarantees for statistical generalization to better accuracy to compression. But support for general constraints within widely used libraries remains scarce and their broader deployment within many applications that can benefit from them remains under-explored. Part of the reason is that Stochastic gradient descent (SGD), the workhorse for training deep neural networks, does not natively deal with constraints with global scope very well. In this paper, we revisit a classical first order scheme from numerical optimization, Conditional Gradients (CG), that has, thus far had limited applicability in training deep models. We show via rigorous analysis how various constraints can be naturally handled by modifications of this algorithm. We provide convergence guarantees and show a suite of immediate benefits that are possible — from training ResNets with fewer layers but better accuracy simply by substituting in our version of CG to faster training of GANs with 50% fewer epochs in image inpainting applications to provably better generalization guarantees using efficiently implementable forms of recently proposed regularizers.

🚀 Conference Pioneer — AAAI 2019

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Mathematics & Optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sathya N. Ravi , Tuan Dinh , Vishnu Suresh Lokhande , Vikas Singh

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Machine Learning > Optimization & Theory > Optimization Deep Learning > Techniques > Model Architecture Mathematics & Optimization > Optimization > Continuous Optimization Machine Learning > Learning Types > Deep Learning Deep Learning > Optimization & Theory > Optimization Deep Learning > Optimization & Theory > Efficient Computing

Keywords

model compression stochastic gradient descent conditional gradient deep neural network generalization guarantee neural network

Download PDF

Related papers

Cooperative Multimodal Approach to Depression Detection in Twitter 2019

Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks 2019

Community Detection in Social Networks Considering Topic Correlations 2019

Session-Based Recommendation with Graph Neural Networks 2019

Blameworthiness in Multi-Agent Settings 2019