Dynamic Forward and Backward Sparse Training (DFBST): Accelerated Deep Learning through Completely Sparse Training Schedule

Tejas Pote; Muhammad Athar Ganaie; Atif Hassan; Swanand Khare

ACML 2022

Abstract

Neural network sparsification has received considerable attention in recent years. A number of dynamic sparse training methods have been developed that reach significant sparsity levels during training while maintaining performance comparable to their dense counterparts. However, most of these methods update all model parameters using dense gradients. Where gradient sparsification is used, it is achieved either by a non-dynamic (fixed) schedule or by a computationally expensive dynamic pruning schedule. To alleviate these drawbacks, we propose Dynamic Forward and Backward Sparse Training (DFBST), an algorithm that dynamically sparsifies both the forward and backward passes using trainable masks, leading to a completely sparse training schedule. In contrast to existing sparse training methods, we propose learning the forward and backward masks separately. Our approach achieves state-of-the-art performance in terms of both accuracy and sparsity compared to existing dynamic pruning algorithms on the benchmark datasets MNIST, CIFAR-10 and CIFAR-100.
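The idea of masking the forward and backward passes separately can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the masks are parameterized or trained, so the magnitude-threshold rule, the fixed (non-trainable) thresholds, and the names `threshold_mask`, `apply_mask` and `sparse_step` are all assumptions made for illustration.

```python
def threshold_mask(values, threshold):
    """Return a 0/1 mask keeping entries whose magnitude exceeds the threshold.

    Assumption: a simple magnitude rule stands in for the trainable masks
    mentioned in the abstract.
    """
    return [[1.0 if abs(v) > threshold else 0.0 for v in row] for row in values]


def apply_mask(values, mask):
    """Element-wise product of a 2-D list of values with a 0/1 mask."""
    return [[v * m for v, m in zip(v_row, m_row)]
            for v_row, m_row in zip(values, mask)]


def sparse_step(weights, grads, fwd_threshold, bwd_threshold, lr=0.1):
    """One illustrative update with separate forward and backward masks.

    The forward mask is derived from the weights and the backward mask from
    the gradients. Both thresholds are fixed here (hypothetical
    simplification); learning them is omitted.
    """
    # Forward pass would use only the weights that survive the forward mask.
    fwd_mask = threshold_mask(weights, fwd_threshold)
    sparse_weights = apply_mask(weights, fwd_mask)

    # Backward pass propagates only the gradients that survive the backward mask.
    bwd_mask = threshold_mask(grads, bwd_threshold)
    sparse_grads = apply_mask(grads, bwd_mask)

    # Gradient-descent update using the sparse gradients only.
    new_weights = [[w - lr * g for w, g in zip(w_row, g_row)]
                   for w_row, g_row in zip(weights, sparse_grads)]
    return sparse_weights, sparse_grads, new_weights
```

In this sketch the two masks are computed independently, so a weight pruned in the forward pass may still receive a gradient, and a weight used in the forward pass may receive none. Whether DFBST couples the masks in any way is not stated in the abstract.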

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Application Areas > Efficient Computing
Deep Learning > Architectures > Neural Networks

Keywords

model compression, sparse training, gradient sparsification, neural network, dynamic pruning, trainable mask
