Pre-Trained Image Processing Transformer

Hanting Chen; Yunhe Wang; Tianyu Guo; Chang Xu; Yiping Deng; Zhenhua Liu; Siwei Ma; Chunjing XU; Chao Xu; Wen Gao

CVPR 2021

Abstract

As the computing power of modern hardware grows rapidly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures. In this paper, we study low-level computer vision tasks (e.g., denoising, super-resolution and deraining) and develop a new pre-trained model, namely, the image processing transformer (IPT). To fully exploit the capability of the transformer, we propose using the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multiple heads and multiple tails. In addition, contrastive learning is introduced to adapt well to different image processing tasks. The pre-trained model can therefore be efficiently employed on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks. Code is available at https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/IPT
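
To make the idea of generating corrupted image pairs from clean images more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' pipeline: the function names, the noise level (sigma = 30), the scale factors (2 and 4), and the use of block averaging in place of a proper resampling filter are all assumptions chosen for illustration. A random array stands in for an ImageNet image.

```python
import numpy as np


def add_gaussian_noise(clean, sigma, rng):
    """Return a noisy copy of `clean` (float array with values in [0, 255])."""
    noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
    return np.clip(noisy, 0.0, 255.0)


def downsample(clean, scale):
    """Shrink height and width by `scale` using block averaging.

    Block averaging is an illustrative stand-in for a resampling filter
    such as bicubic interpolation (assumption, not taken from the paper).
    """
    h, w, c = clean.shape
    h, w = h - h % scale, w - w % scale  # crop so the size divides evenly
    blocks = clean[:h, :w].reshape(h // scale, scale, w // scale, scale, c)
    return blocks.mean(axis=(1, 3))


def make_pairs(clean, rng):
    """Build (corrupted, clean) pairs for a few hypothetical tasks."""
    return {
        "denoise_sigma30": (add_gaussian_noise(clean, 30.0, rng), clean),
        "sr_x2": (downsample(clean, 2), clean),
        "sr_x4": (downsample(clean, 4), clean),
    }


rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 255.0, size=(48, 48, 3))  # stand-in for a clean image
pairs = make_pairs(clean, rng)
for task, (corrupted, target) in pairs.items():
    print(task, corrupted.shape, target.shape)
```

Each task keeps the same clean target and varies only the degradation, which is what would let a single model with task-specific heads and tails train on all of the tasks from one image collection.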

Topics

Deep Learning > Architectures > Transformers
Deep Learning > Techniques > Pretraining
Computer Vision > Processing > Image Restoration
Deep Learning > Models > Transformers
Deep Learning > Learning Types > Transfer Learning

Keywords

transfer learning, image restoration, image denoising, image deraining, image processing transformer, low-level computer vision

Related papers

Learning To Reconstruct High Speed and High Dynamic Range Videos From Events 2021

DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls 2021

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs 2021

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization 2021

Pose-Guided Human Animation From a Single Image in the Wild 2021