MixDiff: Mixing Natural and Synthetic Images for Robust Self-Supervised Representations

Reza Akbarian Bafghi; Nidhin Harilal; Maziar Raissi; Claire Monteleoni

WACV 2025

Abstract

This paper introduces MixDiff, a new self-supervised learning (SSL) pre-training framework that combines real and synthetic images. Unlike traditional SSL methods that predominantly use real images, MixDiff uses a variant of Stable Diffusion to replace an augmented instance of a real image, facilitating the learning of cross real-synthetic image representations. Our key insight is that while models trained solely on synthetic images underperform, combining real and synthetic data leads to more robust and adaptable representations. Experiments show that MixDiff enhances SimCLR, BarlowTwins, and DINO across various robustness datasets and domain transfer tasks, boosting SimCLR's ImageNet-1K accuracy by 4.56%. Our framework also demonstrates comparable performance without needing any augmentations, a surprising finding in SSL, where augmentations are typically crucial. Furthermore, MixDiff achieves similar results to SimCLR while requiring less real data, highlighting its efficiency in representation learning.
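The view-replacement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_view_pair`, the `p_synthetic` parameter, and the choice to also augment the synthetic image are all assumptions made for illustration. The synthetic image is assumed to have been generated beforehand (for example by a Stable Diffusion variant) and paired with its real source image.

```python
import random
from typing import Callable, Optional, Tuple, TypeVar

Image = TypeVar("Image")  # any image type: tensor, array, PIL image, ...


def make_view_pair(
    real: Image,
    synthetic: Image,
    augment: Callable[[Image], Image],
    p_synthetic: float = 1.0,  # hypothetical knob, not taken from the paper
    rng: Optional[random.Random] = None,
) -> Tuple[Image, Image]:
    """Build the two views fed to a joint-embedding SSL objective.

    A standard pipeline would return (augment(real), augment(real)).
    Here the second view is instead derived from the synthetic
    counterpart with probability p_synthetic (an assumption).
    """
    rng = rng or random.Random()
    view_a = augment(real)
    if rng.random() < p_synthetic:
        # Replace the second augmented real view with a synthetic one.
        view_b = augment(synthetic)
    else:
        # Fall back to the usual real-real pair.
        view_b = augment(real)
    return view_a, view_b
```

With an identity `augment` and `p_synthetic=1.0`, the pair is simply the real image and its synthetic counterpart, which corresponds to the augmentation-free setting the abstract mentions. The resulting pair would then be passed to whichever SSL loss is in use (SimCLR, BarlowTwins, or DINO).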

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Application Areas > Domain Adaptation
Deep Learning > Techniques > Pretraining
Deep Learning > Learning Types > Self-Supervised Learning
Computer Vision > Core AI > Computer Vision
Deep Learning > Learning Types > Representation Learning

Keywords

representation learning, image classification, domain adaptation, self-supervised learning, synthetic data, Stable Diffusion, image augmentation
