ICCV 2025

Learning Implicit Features with Flow-Infused Transformations for Realistic Virtual Try-On

Abstract

Diffusion-based virtual try-on aims to synthesize a realistic image that seamlessly integrates a specific garment onto a target model. The primary challenge lies in effectively guiding the warping process of the latent diffusion model. Previous methods, however, either lack direct guidance or explicitly warp the garment image, which makes the result highly dependent on the performance of the warping module. In this paper, we propose FIA-VTON, which uses implicit flow features as guidance by adopting a Flow Infused Attention module for virtual try-on. The dense warp flow map is projected as indirect guidance that implicitly enhances feature map warping during generation, making the method less sensitive to warp estimation accuracy than an explicit warp of the garment image. To further strengthen this implicit warp guidance, we incorporate high-level spatial attention to complement the dense warp. Experimental results on the VITON-HD and DressCode datasets show that FIA-VTON significantly outperforms state-of-the-art methods, demonstrating that it is effective and robust for virtual try-on.
