EMNLP 2021

DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages

Abstract

Word meaning is notoriously difficult to capture, both synchronically and diachronically. In this paper, we describe the creation of the largest resource of graded, contextualized, diachronic word meaning annotation in four different languages, based on 100,000 human semantic proximity judgments. We describe in detail the multi-round incremental annotation process, the choice of a clustering algorithm to group usages into senses, and possible diachronic and synchronic uses of this dataset.
