Detecting Subtle Sense Shift with Polysemy-Aware Trends

Ondřej Herman; Pavel Rychlý

EACL 2026

Abstract

Language changes faster than dictionaries can be revised, yet automatic tools still struggle to spot the subtle, short-term shifts in meaning that precede a formal update. We present a language-independent pipeline that detects word-sense shifts in large, time-stamped web corpora. The method couples a robust re-implementation of the Adaptive Skip-Gram model, which induces multiple sense vectors per lemma without any external inventory, with a second stage that tracks each sense through time under three alternative frequency normalizations. Linear regression and the robust Mann-Kendall/Theil-Sen estimator then test whether a sense’s frequency slope deviates significantly from zero, producing a ranked list of headwords whose semantics are drifting.

We evaluate the system on the English (12 B tokens) and Czech (1 B tokens) Timestamped corpora for May 2023 to May 2025. Expert annotation of the top-100 candidates for each model variant shows that 50.7% of Czech and 25.7% of English headwords exhibit genuine sense shifts, despite web-scale noise.
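The trend-testing stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes equally spaced time periods, ignores ties in the Mann-Kendall variance, and uses a simple relative-frequency normalization (sense count divided by total tokens in the period) as a stand-in, since the abstract does not name the three normalizations. The function names and the monthly counts are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def relative_frequency(sense_counts, totals):
    """Assumed normalization: sense occurrences per token in each period."""
    return [c / t for c, t in zip(sense_counts, totals)]


def theil_sen_slope(series):
    """Median of all pairwise slopes; periods are assumed equally spaced."""
    slopes = [(series[j] - series[i]) / (j - i)
              for i, j in combinations(range(len(series)), 2)]
    return median(slopes)


def mann_kendall(series):
    """Return (S, z, two-sided p) via the normal approximation, ignoring ties."""
    n = len(series)
    s = 0
    for i, j in combinations(range(n), 2):
        diff = series[j] - series[i]
        s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return s, z, p


# Hypothetical monthly counts for one induced sense and the whole corpus.
sense_counts = [120, 135, 150, 149, 170, 190, 205, 230]
corpus_totals = [1_000_000] * 8

freq = relative_frequency(sense_counts, corpus_totals)
s, z, p = mann_kendall(freq)
print(f"slope={theil_sen_slope(freq):.3e}, S={s}, z={z:.2f}, p={p:.4f}")
```

Under these assumptions, a sense would be flagged when `p` falls below a chosen significance level, and flagged senses could then be ranked by the magnitude of the Theil-Sen slope. The median of pairwise slopes is less sensitive to a single outlying month than an ordinary least-squares fit, which is the usual reason for preferring it on noisy web data.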

Topics

Machine Learning > Core Methods > Representation Learning

Keywords

representation learning, word sense disambiguation, temporal analysis, frequency normalization, semantic drift, vector space model
