2024 EMNLP EMNLP 2024

Efficient Overshadowed Entity Disambiguation by Mitigating Shortcut Learning

Abstract

AbstractEntity disambiguation (ED) is crucial in natural language processing (NLP) for tasks such as question-answering and information extraction. A major challenge in ED is handling overshadowed entities—uncommon entities sharing mention surfaces with common entities. The current approach to enhance performance on these entities involves reasoning over facts in a knowledge base (KB), increasing computational overhead during inference. We argue that the ED performance on overshadowed entities can be enhanced during training by addressing shortcut learning, which does not add computational overhead at inference. We propose a simple yet effective debiasing technique to prevent models from shortcut learning during training. Experiments on a range of ED datasets show that our method achieves state-of-the-art performance without compromising inference speed. Our findings suggest a new research direction for improving entity disambiguation via shortcut learning mitigation.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — overshadowed entity
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio