Smooth Maximum Unit: Smooth Activation Function for Deep Networks Using Smoothing Maximum Technique

Koushik Biswas; Sandeep Kumar; Shilpak Banerjee; Ashish Kumar Pandey

2022 CVPR CVPR 2022

Smooth Maximum Unit: Smooth Activation Function for Deep Networks Using Smoothing Maximum Technique

Abstract

Deep learning researchers have a keen interest in proposing new novel activation functions that can boost neural network performance. A good choice of activation function can have a significant effect on improving network performance and training dynamics. Rectified Linear Unit (ReLU) is a popular hand-designed activation function and is the most common choice in the deep learning community due to its simplicity though ReLU has some drawbacks. In this paper, we have proposed two new novel activation functions based on approximation of the maximum function, and we call these functions Smooth Maximum Unit (SMU and SMU-1). We show that SMU and SMU-1 can smoothly approximate ReLU, Leaky ReLU, or more general Maxout family, and GELU is a particular case of SMU. Replacing ReLU by SMU, Top-1 classification accuracy improves by 6.22%, 3.39%, 3.51%, and 3.08% on the CIFAR100 dataset with ShuffleNet V2, PreActResNet-50, ResNet-50, and SeNet-50 models respectively. Also, our experimental evaluation shows that SMU and SMU-1 improve network performance in a variety of deep learning tasks like image classification, object detection, semantic segmentation, and machine translation compared to widely used activation functions.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🧭 Keyword Pioneer — smooth maximum

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Koushik Biswas , Sandeep Kumar , Shilpak Banerjee , Ashish Kumar Pandey

Topics

Machine Learning > Optimization & Theory > Neural Network Optimization Deep Learning > Architectures > Neural Networks Deep Learning > Techniques > Model Architecture Deep Learning > Optimization & Theory > Neural Network Optimization Deep Learning > Optimization & Theory > Optimization

Keywords

image classification object detection deep learning neural network optimization activation function smoothing technique neural network smooth approximation smooth maximum maxout family

Download PDF

Related papers

UniCoRN: A Unified Conditional Image Repainting Network 2022

Why Discard if You Can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis 2022

All-in-One Image Restoration for Unknown Corruption 2022

Stability-Driven Contact Reconstruction From Monocular Color Images 2022

Forecasting Characteristic 3D Poses of Human Actions 2022