2016 COLING COLING 2016

Time-Independent and Language-Independent Extraction of Multiword Expressions From Twitter

Abstract

AbstractMultiword Expressions (MWEs) are crucial lexico-semantic units in any language. However, most work on MWEs has been focused on standard monolingual corpora. In this work, we examine MWE usage on Twitter - an inherently multilingual medium with an extremely short average text length that is often replete with grammatical errors. In this work we present a new graph based, language agnostic method for automatically extracting MWEs from tweets. We show how our method outperforms standard Association Measures. We also present a novel unsupervised evaluation technique to ascertain the accuracy of MWE extraction.

🌉 Interdisciplinary Bridge — Computer Science and Data Science & Analytics
🧭 Keyword Pioneer — association measure
🐣 Hot Topic Early Bird — social media
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio