EACL 2026

Principled Self-Correction in Discrete Diffusion: A UCB-Guided Framework for Text Generation

Abstract

Diffusion models, inspired by their success in image synthesis, offer a flexible, iterative alternative to rigid left-to-right text generation. However, a fundamental training-inference discrepancy hinders their performance: models are trained on corrupted ground-truth tokens, but at inference time they must denoise inputs corrupted from their own predictions. To bridge this gap, we propose a unified framework with two components. First, Deeper Self-Prediction (DSP) is a multi-step training objective that teaches robust self-correction by forcing the model to denoise its own intermediate outputs. Second, UCB-guided Decoding is a principled inference algorithm that frames token re-masking as a multi-armed bandit problem, using the Upper Confidence Bound (UCB) to balance exploration and exploitation. Experiments on text generation tasks demonstrate consistent improvements over existing diffusion baselines. The framework achieves higher faithfulness and coherence according to both automatic metrics and LLM-as-a-Judge evaluations.
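To make the bandit framing concrete, the following is a minimal sketch of how a UCB-style score could rank token positions for re-masking. It is not the paper's algorithm: the abstract does not give the scoring rule, so the function name `ucb_remask`, the reward definition (one minus the model's confidence), the smoothed visit counts, and the exploration weight `c` are all assumptions chosen for illustration, using the standard UCB1 form of a mean estimate plus a square-root bonus.

```python
import math


def ucb_remask(confidence, visits, step, k, c=1.0):
    """Return the k positions with the highest UCB1-style re-masking score.

    All names and the reward definition are hypothetical, for illustration only.
    confidence: model confidence in [0, 1] for the current token at each position
    visits:     how many times each position has already been re-masked
    step:       current decoding step, 1-indexed
    c:          exploration weight (assumed hyperparameter)
    """
    scores = []
    for conf, n in zip(confidence, visits):
        # Exploitation: low-confidence tokens have the most to gain from re-prediction.
        exploit = 1.0 - conf
        # Exploration: positions that were rarely re-masked get a larger bonus.
        # The +1 terms keep the bonus finite when n == 0 or step == 1.
        explore = c * math.sqrt(math.log(step + 1) / (n + 1))
        scores.append(exploit + explore)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


confidence = [0.9, 0.2, 0.6, 0.95]
visits = [0, 3, 0, 0]  # position 1 has already been re-masked three times

# Small exploration weight: the least confident positions are chosen.
print(ucb_remask(confidence, visits, step=2, k=2, c=0.5))  # [1, 2]
# Large exploration weight: the heavily revisited position 1 is dropped.
print(ucb_remask(confidence, visits, step=2, k=2, c=2.0))  # [0, 2]
```

Under these assumptions, `c` controls the trade-off the abstract describes: a small value re-masks only the tokens the model is least sure about, while a larger value also revisits positions that have been left untouched so far.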
