Learning Mask-Aware Offsets: Two-branch Deformable Attention Networks for Inpainting with Masked Region Avoidance
Abstract
Image inpainting restores missing regions in a visually plausible way, yet existing methods struggle with irregular holes and complex structures because they rely on fixed kernels and static attention. In this paper, we propose the Mask-Aware Deformable Inpainting Network (MADIN), which incorporates mask information for position-aware control of where attention samples from. The proposed model employs a Two-branch Offset Estimator that jointly uses query features and mask signals to reliably predict reference positions, even for queries inside masked regions. In addition, Adaptive Offset Range Scaling adjusts the offset magnitude according to the masking ratio, so that broader context is captured when needed. Experiments on CelebA-HQ and Places2 show that MADIN achieves state-of-the-art results in PSNR, SSIM, LPIPS, and FID while remaining lightweight and efficient. The code will be released at https://github.com/ohhhyeongsn/MADIN
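To make the two components concrete, the following is a minimal sketch, not the released implementation. It assumes a linear rule for the offset range and a simple additive fusion of the two branches bounded by tanh; the names `mask_ratio`, `offset_range`, `predict_offset`, `base_range`, and `max_range` are hypothetical and do not come from the MADIN code.

```python
import math


def mask_ratio(mask):
    """Fraction of masked pixels; mask is a 2D list with 1 = missing, 0 = known."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def offset_range(ratio, base_range=2.0, max_range=8.0):
    """Assumed linear rule: a larger masked fraction gives a wider sampling range."""
    return base_range + (max_range - base_range) * ratio


def predict_offset(query_out, mask_out, ratio):
    """Fuse the two branch outputs (assumed: addition), bound them with tanh,
    then scale by the adaptive range so that |dx|, |dy| <= offset_range(ratio)."""
    r = offset_range(ratio)
    dx = r * math.tanh(query_out[0] + mask_out[0])
    dy = r * math.tanh(query_out[1] + mask_out[1])
    return dx, dy


if __name__ == "__main__":
    mask = [[0, 0, 1, 1],
            [0, 0, 1, 1]]
    ratio = mask_ratio(mask)
    # Placeholder branch outputs; in a real model these would come from learned layers.
    print(ratio, offset_range(ratio), predict_offset((0.3, -0.1), (0.5, 0.2), ratio))
```

Under these assumptions, an image with half of its pixels masked would sample reference positions within a range midway between `base_range` and `max_range`, which reflects the idea of reaching further for context as the hole grows.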