2026 WACV WACV 2026

Exploring the Boundaries of Diffusion Models for Offline Writer Identification with Sparse and Intra-Variable Data

Abstract

Offline writer identification poses significant challenges when training data is scarce, and handwriting styles exhibit high intra-writer variability. This scenario is common in practical applications such as forensic analysis and historical document authentication, where only a limited number of handwritten samples are available per writer. In this paper, we explore the viability of using diffusion models to capture writer-specific traits under such challenging conditions. Specifically, we investigate their performance in both text-dependent and text-independent setups, where lexical similarity varies across samples. We propose a novel diffusion-based writer identification framework that integrates a style encoder and handcrafted textural features in a joint training pipeline. Our approach is evaluated on a recent dataset with high intra-writer variability as well as three benchmark datasets (IAM, CERUG-EN, and CVL). Experimental results demonstrate that while diffusion models excel in text-dependent scenarios, their generalization capability diminishes in text-independent settings due to the entanglement of content and style features. This study highlights both the promise and the current limitations of generative diffusion models for fine-grained handwriting style modeling. We identify avenues for improving generalization through disentangled representations, domain adaptation, and hybrid discriminative-generative architectures. The proposed framework contributes to the growing efforts toward scalable, style-aware writer identification in real-world, unconstrained handwriting scenarios.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning
🧭 Keyword Pioneer — offline recognition
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio