2026 AAAI AAAI 2026

Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection

Abstract

Abstract Despite substantial progress in anomaly synthesis, existing diffusion-based and coarse inpainting pipelines commonly suffer from structural deficiencies such as micro-structural discontinuities, limited semantic controllability, and inefficient generation. To overcome these limitations, we introduce ARAS, a language-conditioned, auto-regressive anomaly synthesis approach that precisely injects local, text-specified defects into normal images via token-anchored latent editing. Leveraging a hard-gated auto-regressive operator and a training-free, context-preserving masked sampling kernel, ARAS significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies. Integrated within our Quality-Aware Re-weighted Anomaly Detection (QARAD) framework, we propose a dynamic weighting strategy that emphasizes high-quality synthetic samples by computing an image-text similarity score with a dual-encoder model. Extensive experiments across three datasets, MVTec AD, VisA, and BTAD, demonstrate that our QARAD outperforms SOTA methods in both image- and pixel-level anomaly detection tasks, achieving improved accuracy, robustness, and a 5× synthesis speedup compared to diffusion-based alternatives.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Machine Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio