ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks

Kai Han; Yunhe Wang; Jianyuan Guo; Enhua Wu

CVPR 2024

Abstract

Large-scale visual pretraining has significantly improved the performance of large vision models. However, we observe the low FLOPs pitfall: existing low-FLOPs models cannot benefit from large-scale pretraining. In this paper, we introduce a novel design principle, termed ParameterNet, aimed at augmenting the number of parameters in large-scale visual pretraining models while minimizing the increase in FLOPs. We leverage dynamic convolutions to incorporate additional parameters into the networks with only a marginal rise in FLOPs. The ParameterNet approach allows low-FLOPs networks to take advantage of large-scale visual pretraining. Furthermore, we extend the ParameterNet concept to the language domain to enhance inference results while preserving inference speed. Experiments on the large-scale ImageNet-22K dataset show the superiority of our ParameterNet scheme. For example, ParameterNet-600M achieves higher accuracy than the widely used Swin Transformer (81.6% vs. 80.9%) with much lower FLOPs (0.6G vs. 4.5G). The code will be released at https://parameternet.github.io/.
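
To make the "more parameters, marginal FLOPs" claim concrete, the following is a minimal back-of-the-envelope sketch rather than the authors' implementation. It assumes one common form of dynamic convolution: several candidate kernels ("experts") are blended into a single kernel using coefficients predicted from the globally pooled input, and that blended kernel is applied as an ordinary convolution. The function names, the two-layer router, the layer shape, and the values of `num_experts` and `hidden` are all hypothetical choices made for illustration; stride 1 with same padding and no bias is assumed.

```python
def conv_cost(c_in, c_out, k, h, w):
    """Parameters and multiply-adds of a standard k x k convolution.

    Assumes stride 1, same padding, and no bias, so the output is h x w.
    """
    params = c_in * c_out * k * k
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def dynamic_conv_cost(c_in, c_out, k, h, w, num_experts, hidden):
    """Cost of a dynamic convolution with `num_experts` candidate kernels.

    Assumed design (hypothetical, not taken from the paper): a two-layer
    MLP on the globally average-pooled input yields `num_experts` mixing
    coefficients; the expert kernels are blended into one kernel, which is
    then applied as an ordinary convolution.
    """
    kernel_params = c_in * c_out * k * k
    router_params = c_in * hidden + hidden * num_experts
    params = num_experts * kernel_params + router_params

    pool_ops = c_in * h * w                  # global average pooling
    router_macs = router_params              # two small dense layers
    mix_macs = num_experts * kernel_params   # blend kernels once per input
    conv_macs = kernel_params * h * w        # same as the standard conv
    macs = pool_ops + router_macs + mix_macs + conv_macs
    return params, macs


if __name__ == "__main__":
    shape = dict(c_in=64, c_out=64, k=3, h=14, w=14)
    p0, m0 = conv_cost(**shape)
    p1, m1 = dynamic_conv_cost(**shape, num_experts=4, hidden=16)
    print(f"parameters: x{p1 / p0:.2f}, multiply-adds: x{m1 / m0:.2f}")
```

Under these assumptions, the illustrative 64-channel layer ends up with roughly four times the parameters for about 2% more multiply-adds. The reason is that kernel blending happens once per input, whereas the convolution itself is repeated at every spatial position, so the extra experts add little to the total compute.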

Authors

Kai Han, Yunhe Wang, Jianyuan Guo, Enhua Wu

Topics

Machine Learning > Application Areas > Efficient Computing
Deep Learning > Techniques > Model Architecture
Deep Learning > Techniques > Pretraining

Keywords

model pretraining, neural network optimization, parameter efficiency, mobile network, dynamic convolution
