Disentangling Subject-Irrelevant Elements in Personalized Text-to-Image Diffusion via Filtered Self-Distillation

Seunghwan Choi; Jooyeol Yun; Jeonghoon Park; Jaegul Choo

WACV 2025

Abstract

Recent research has introduced techniques for customizing large-scale text-to-image models. These models bind a unique subject desired by a user to a specific token and use that token to generate the subject in various contexts. However, models from previous studies also bind elements unrelated to the subject's identity, such as common backgrounds or poses in the reference images. This often leads to conflicts between the token and the context of text prompts during inference, causing the model to fail to generate both the subject and the prompted context. In this work, we approach this issue from a data-scarcity perspective and propose to augment the number of reference images through a novel self-distillation framework. Our framework selects high-quality samples from images generated by a teacher model and uses them in student training. Our framework can be applied to any model that suffers from these conflicts, and we demonstrate through comprehensive evaluations that it resolves the issue most effectively.
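The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how "high-quality" is measured, so everything below is an assumption made for illustration: the `Candidate` record, the two scores (`subject_score`, `prompt_score`), the thresholds, and the `top_k` cutoff are hypothetical and are not taken from the paper. The sketch only shows the general shape of filtering teacher-generated images and adding the survivors to the student's training set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One teacher-generated image with hypothetical quality scores in [0, 1]."""
    image_id: str
    prompt: str
    subject_score: float  # assumed: similarity to the reference subject
    prompt_score: float   # assumed: agreement with the text prompt


def filter_candidates(
    candidates: List[Candidate],
    min_subject: float = 0.7,  # illustrative threshold, not from the paper
    min_prompt: float = 0.3,   # illustrative threshold, not from the paper
    top_k: int = 2,
) -> List[Candidate]:
    """Keep candidates passing both thresholds, then the top_k by summed score."""
    passed = [
        c for c in candidates
        if c.subject_score >= min_subject and c.prompt_score >= min_prompt
    ]
    passed.sort(key=lambda c: c.subject_score + c.prompt_score, reverse=True)
    return passed[:top_k]


def build_student_set(reference_ids: List[str],
                      candidates: List[Candidate],
                      **filter_kwargs) -> List[str]:
    """Augment the original reference images with the filtered teacher samples."""
    kept = filter_candidates(candidates, **filter_kwargs)
    return list(reference_ids) + [c.image_id for c in kept]


# Toy data standing in for images sampled from a teacher model.
candidates = [
    Candidate("a", "a photo of <token> on a beach", 0.90, 0.35),
    Candidate("b", "a photo of <token> in the snow", 0.50, 0.40),  # subject lost
    Candidate("c", "a photo of <token> in a city", 0.80, 0.20),    # context ignored
    Candidate("d", "a photo of <token> in a forest", 0.75, 0.60),
    Candidate("e", "a photo of <token> on a sofa", 0.72, 0.31),
]
student_set = build_student_set(["ref_0"], candidates)
print(student_set)
```

In this toy run, candidates that lose the subject or ignore the prompted context are discarded before the student sees them, which is the intuition behind filtering: only samples that show the subject in a new context are useful for separating the subject from its original background or pose.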

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Deep Learning > Models > Diffusion Models

Keywords

personalized generation, text-to-image diffusion, subject disentanglement, token binding

Related papers

Neural Graph Map: Dense Mapping with Efficient Loop Closure Integration 2025

ELMGS: Enhancing Memory and Computation Scalability through Compression for 3D Gaussian Splatting 2025

Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation 2025

Uncertainty-Aware Online Extrinsic Calibration: A Conformal Prediction Approach 2025

Disentangling Spatio-Temporal Knowledge for Weakly Supervised Object Detection and Segmentation in Surgical Video 2025