AnyBald: Toward Realistic Diffusion-Based Hair Removal In-The-Wild
Abstract
We present AnyBald, a novel framework for realistic hair removal from portrait images captured under diverse in-the-wild conditions. A key challenge in this task is the lack of high-quality paired data: existing datasets are often low in quality and limited in viewpoint variation and diversity, which makes real-world cases difficult to handle. To address this, we construct a scalable data-augmentation pipeline that synthesizes high-quality hair and non-hair image pairs across diverse real-world scenarios, enabling effective generalization and scalable supervision. With this enriched dataset, we present a new hair-removal framework that reformulates pretrained latent diffusion inpainting using learnable text prompts, removing the need for explicit masking at inference. As a result, our model localizes hair implicitly and removes it naturally while preserving the semantics of the rest of the image. To further enhance spatial precision, we introduce a regularization loss that guides the model to attend specifically to hair regions. Extensive experiments demonstrate that AnyBald outperforms existing methods in hair removal while preserving identity and semantics across various in-the-wild domains. Our project page is available at https://vision3d-lab.github.io/anybald/.