2023 ICCV ICCV 2023

ParCNetV2: Oversized Kernel with Enhanced Attention

Abstract

Transformers have shown great potential in various computer vision tasks. By borrowing design concepts from transformers, many studies revolutionized CNNs and showed remarkable results. This paper falls in this line of studies. Specifically, we propose a new convolutional neural network, ParCNetV2, that extends the research line of ParCNetV1 by bridging the gap between CNN and ViT. It introduces two key designs: 1) Oversized Convolution (OC) with twice the size of the input, and 2) Bifurcate Gate Unit (BGU) to ensure that the model is input adaptive. Fusing OC and BGU in a unified CNN, ParCNetV2 is capable of flexibly extracting global features like ViT, while maintaining lower latency and better accuracy. Extensive experiments demonstrate the superiority of our method over other convolutional neural networks and hybrid models that combine CNNs and transformers. The code is publicly available at https://github.com/XuRuihan/ParCNetV2.

🧭 Keyword Pioneer — oversized convolution
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio