Literal or idiomatic? Identifying the reading of single occurrences of German multiword expressions using word embeddings

Rafael Ehren

2017 EACL EACL 2017

Literal or idiomatic? Identifying the reading of single occurrences of German multiword expressions using word embeddings

Abstract

AbstractNon-compositional multiword expressions (MWEs) still pose serious issues for a variety of natural language processing tasks and their ubiquity makes it impossible to get around methods which automatically identify these kind of MWEs. The method presented in this paper was inspired by Sporleder and Li (2009) and is able to discriminate between the literal and non-literal use of an MWE in an unsupervised way. It is based on the assumption that words in a text form cohesive units. If the cohesion of these units is weakened by an expression, it is classified as literal, and otherwise as idiomatic. While Sporleder an Li used Normalized Google Distance to modell semantic similarity, the present work examines the use of a variety of different word embeddings.

❓ The Questioner

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Rafael Ehren

Topics

Machine Learning > Learning Types > Unsupervised Learning Natural Language Processing > Understanding > Semantic Analysis

Keywords

unsupervised learning semantic analysis word embedding multiword expression

Download PDF

Related papers

Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages 2017

Learning and Knowledge Transfer with Memory Networks for Machine Comprehension 2017

Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings 2017

Building Web-Interfaces for Vector Semantic Models with the WebVectors Toolkit 2017

Assessing Convincingness of Arguments in Online Debates with Limited Number of Features 2017