EMNLP 2025

Just One is Enough: An Existence-based Alignment Check for Robust Japanese Pronunciation Estimation

Abstract

Neural models for Japanese pronunciation estimation often suffer from errors such as hallucinations (generating pronunciations that are not grounded in the input) and omissions (skipping parts of the input). Although attention-based alignment has been used to detect such errors, selecting reliable attention heads is difficult, and developing methods that can both detect and correct these errors remains challenging. In this paper, we propose a simple method called existence-based alignment check. In this approach, we consider alignment candidates independently extracted from all attention heads, and check whether at least one of these candidates satisfies two conditions derived from the linguistic properties of Japanese pronunciation: monotonicity and pronunciation length per character. We generate multiple hypotheses using beam search and use the alignment check as a filtering mechanism to correct hallucinations and omissions. We apply this method to a dataset of Japanese facility names and demonstrate that it improves pronunciation estimation accuracy by over 2.5%.
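The check described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how alignment candidates are extracted or which length bounds are used, so the argmax-based extraction, the `min_len` and `max_len` values, the fallback to the top beam hypothesis, and all function names below are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# One attention head: a matrix of weights with shape [output_len][input_len].
Attention = Sequence[Sequence[float]]


def extract_alignment(attn: Attention) -> List[int]:
    """Assumed extraction rule: align each output token to its argmax input position."""
    return [max(range(len(row)), key=row.__getitem__) for row in attn]


def is_monotonic(alignment: Sequence[int]) -> bool:
    """Condition 1: aligned input positions never move backwards."""
    return all(a <= b for a, b in zip(alignment, alignment[1:]))


def lengths_ok(alignment: Sequence[int], input_len: int,
               min_len: int = 1, max_len: int = 4) -> bool:
    """Condition 2: each input character receives a plausible number of output tokens.

    min_len and max_len are hypothetical bounds, not values from the paper.
    """
    counts = [0] * input_len
    for pos in alignment:
        counts[pos] += 1
    return all(min_len <= c <= max_len for c in counts)


def existence_check(heads: Sequence[Attention], input_len: int,
                    min_len: int = 1, max_len: int = 4) -> bool:
    """Pass if at least one head yields an alignment satisfying both conditions."""
    return any(
        is_monotonic(a) and lengths_ok(a, input_len, min_len, max_len)
        for a in map(extract_alignment, heads)
    )


def select_hypothesis(hypotheses: Sequence[Tuple[str, Sequence[Attention]]],
                      input_len: int) -> str:
    """Return the best-ranked beam hypothesis that passes the check.

    hypotheses holds (pronunciation, attention heads) pairs, best score first.
    Falling back to the top hypothesis when none passes is an assumption.
    """
    for pronunciation, heads in hypotheses:
        if existence_check(heads, input_len):
            return pronunciation
    return hypotheses[0][0]
```

Under these assumptions, a hypothesis that leaves an input character with no aligned output tokens (an omission) or assigns too many tokens to one character (a possible hallucination) fails the length condition in every head and is skipped in favor of a lower-ranked beam candidate. Because only one head needs to pass, no head selection step is required.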
