EMNLP 2021
WikiGUM: Exhaustive Entity Linking for Wikification in 12 Genres
Abstract
Previous work on Entity Linking has focused on resources targeting non-nested proper named entity mentions, often in data from Wikipedia, i.e., Wikification. In this paper, we present and evaluate WikiGUM, a fully wikified dataset covering all mentions of named entities, including their non-named and pronominal mentions, as well as mentions nested within other mentions. The dataset covers a broad range of 12 written and spoken genres, most of which have not been included in Entity Linking efforts to date, leading to poor performance by a pretrained SOTA system in our evaluation. The availability of a variety of other annotations for the same data also enables further research on entities in context.
Topics
Natural Language Processing > Understanding > Named Entity Recognition
Natural Language Processing > Applications > Information Extraction
Natural Language Processing > Resources & Methods > Text Representation
Natural Language Processing > Applications > Named Entity Recognition
Natural Language Processing > Applications > Entity Linking