2023
EMNLP
Entity Disambiguation on a Tight Labeling Budget
Abstract
Many real-world NLP applications face the challenge of training an entity disambiguation model for a specific domain with a small labeling budget. In this setting there is often access to a large unlabeled pool of documents. It is then natural to ask the question: which samples should be selected for annotation? In this paper we propose a solution that combines feature diversity with low rank correction. Our sampling strategy is formulated in the context of bilinear tensor models. Our experiments show that the proposed approach can significantly reduce the amount of labeled data necessary to achieve a given performance.
Topics
Machine Learning > Learning Types > Active Learning
Natural Language Processing > Understanding > Named Entity Recognition
Natural Language Processing > Applications > Named Entity Recognition
Machine Learning > Core Methods > Sparse Optimization
Machine Learning > Learning Paradigms > Active Learning