Improving Image Restoration through Removing Degradations in Textual Representations

Jingbo Lin; Zhilu Zhang; Yuxiang Wei; Dongwei Ren; Dongsheng Jiang; Qi Tian; Wangmeng Zuo

CVPR 2024

Abstract

In this paper, we introduce a new perspective for improving image restoration: removing degradation in the textual representations of a given degraded image. Intuitively, restoration is much easier in the text modality than in the image modality. For example, it can be carried out simply by removing degradation-related words while keeping the content-aware words. Hence, we combine the strength of images in describing detail with the strength of text in removing degradation to perform restoration. To enable this cross-modal assistance, we propose to map the degraded images into textual representations for removing the degradations, and then convert the restored textual representations into a guidance image for assisting image restoration. In particular, we embed an image-to-text mapper and a text restoration module into CLIP-equipped text-to-image models to generate the guidance. Then, we adopt a simple coarse-to-fine approach to dynamically inject multi-scale information from the guidance into image restoration networks. Extensive experiments are conducted on various image restoration tasks, including deblurring, dehazing, deraining, denoising, and all-in-one image restoration. The results show that our method outperforms state-of-the-art ones across all these tasks. The code and models are available at https://github.com/mrluin/TextualDegRemoval.
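The intuition that restoration is easier on text can be illustrated with a toy sketch. This is not the authors' method, which operates on CLIP textual representations rather than on literal words; the word list `DEGRADATION_WORDS` and the function `remove_degradation_words` are hypothetical names introduced only for this illustration.

```python
# Toy illustration only: the paper restores CLIP textual representations,
# not plain word lists. The vocabulary below is an assumption made for this sketch.
DEGRADATION_WORDS = {"blurry", "blurred", "hazy", "foggy", "rainy", "noisy", "grainy"}


def remove_degradation_words(caption: str) -> str:
    """Drop degradation-related words and keep the content-aware words."""
    kept = [
        word
        for word in caption.split()
        # Compare case-insensitively and ignore trailing punctuation.
        if word.lower().strip(".,") not in DEGRADATION_WORDS
    ]
    return " ".join(kept)


degraded_caption = "a blurry noisy photo of a red car on a street"
print(remove_degradation_words(degraded_caption))
# prints: a photo of a red car on a street
```

In this sketch the "restored" caption keeps every content word, which is the property the abstract relies on: the degradation is separable in text in a way it is not in pixels.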

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Generation > Image Generation
Computer Vision > Processing > Image Restoration
Deep Learning > Learning Types > Multi-Modal Learning

Keywords

image restoration, cross-modal learning, image denoising, image deblurring, text-to-image model, CLIP model, textual representation
