ReEdit: Multimodal Exemplar-Based Image Editing

Ashutosh Srivastava; Tarun Ram Menta; Abhinav Java; Avadhoot Gorakh Jadhav; Silky Singh; Surgan Jandial; Balaji Krishnamurthy

WACV 2025

Abstract

Modern Text-to-Image (T2I) diffusion models have revolutionized image editing by enabling the generation of high-quality, photorealistic images. While the de facto method for performing edits with T2I models is through text instructions, this approach is non-trivial due to the complex many-to-many mapping between natural language and images. In this work, we address exemplar-based image editing: the task of transferring an edit from an exemplar pair to one or more content images. We propose ReEdit, a modular and efficient end-to-end framework that captures edits in both text and image modalities while ensuring the fidelity of the edited image. We validate the effectiveness of ReEdit through extensive comparisons with state-of-the-art baselines and sensitivity analyses of key design choices. Our results demonstrate that ReEdit consistently outperforms contemporary approaches both qualitatively and quantitatively. Additionally, ReEdit has high practical applicability, as it does not require any task-specific optimization and is four times faster than the existing state of the art. The code and data for our work are available at https://reedit-diffusion.github.io/.

Topics

Deep Learning > Models > Diffusion Models
Computer Vision > Generation > Image Generation
Computer Vision > Processing > Image Editing
Machine Learning > Learning Types > Multi-Modal Learning

Keywords

image generation, multimodal learning, image editing, generative model, text-to-image diffusion, exemplar-based learning
