UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization

Shuaibo Li; Wei Ma; Jianwei Guo; Shibiao Xu; Benchong Li; Xiaopeng Zhang

CVPR 2024

Abstract

We present UnionFormer, a novel framework that integrates tampering clues across three views through unified learning for image manipulation detection and localization. Specifically, we construct a BSFI-Net to extract tampering features from the RGB and noise views, achieving enhanced responsiveness to boundary artifacts while modulating spatial consistency at different scales. Additionally, to explore the inconsistency between objects as a new view of clues, we combine object consistency modeling with tampering detection and localization into a three-task unified learning process, allowing the tasks to promote and improve one another. We thereby acquire a unified manipulation-discriminative representation under multi-scale supervision that consolidates information from the three views. This integration facilitates highly effective concurrent detection and localization of tampering. We perform extensive experiments on diverse datasets, and the results show that the proposed approach outperforms state-of-the-art methods in tampering detection and localization.
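The abstract describes a three-task unified learning process but does not state the loss functions or how the tasks are weighted. As a rough illustration only, the sketch below shows one common way to train three related tasks jointly: a weighted sum of per-task losses. The use of binary cross-entropy for every task, the function names (`bce`, `mean_bce`, `unified_loss`), and the equal default weights are all assumptions made for this example, not details taken from the paper.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mean_bce(probs, labels):
    """Average binary cross-entropy over a flat list of predictions."""
    return sum(bce(p, y) for p, y in zip(probs, labels)) / len(probs)


def unified_loss(det_prob, det_label,
                 mask_probs, mask_labels,
                 obj_probs, obj_labels,
                 weights=(1.0, 1.0, 1.0)):
    """Hypothetical joint objective: weighted sum of three task losses.

    det_*  : image-level tampered / authentic prediction (detection)
    mask_* : per-pixel tampered / authentic predictions (localization)
    obj_*  : per-object consistency predictions (assumed to be binary here)
    """
    w_det, w_loc, w_obj = weights
    parts = {
        "detection": w_det * bce(det_prob, det_label),
        "localization": w_loc * mean_bce(mask_probs, mask_labels),
        "object_consistency": w_obj * mean_bce(obj_probs, obj_labels),
    }
    return sum(parts.values()), parts


if __name__ == "__main__":
    total, parts = unified_loss(
        det_prob=0.9, det_label=1,
        mask_probs=[0.8, 0.2, 0.6], mask_labels=[1, 0, 1],
        obj_probs=[0.7, 0.3], obj_labels=[1, 0],
    )
    print(total, parts)
```

Because the three terms are summed into a single scalar, a gradient step on it updates shared features using the signal from all three tasks at once, which is the general idea behind letting the tasks improve one another.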

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Architectures > Transformers
Computer Vision > Analysis > Anomaly Detection
Machine Learning > Learning Types > Multi-Task Learning
Computer Vision > Processing > Image Processing

Keywords

multi-view learning; image manipulation detection; multi-view representation; tamper detection; tampering localization; unified learning
