Learning Topic-Sensitive Word Representations

Marzieh Fadaee; Arianna Bisazza; Christof Monz

ACL 2017

Abstract

Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate a single representation per word and do not account for the word's different meanings. We present two approaches that learn multiple topic-sensitive representations per word using a Hierarchical Dirichlet Process. We observe that by modeling topics and integrating the topic distribution of each document, we obtain representations that can distinguish between different meanings of a given word. Our models yield statistically significant improvements on the lexical substitution task, indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.

Topics

Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Resources & Methods > Lexical Semantics
Machine Learning > Bayesian & Probabilistic > Bayesian Learning
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Core Methods > Topic Modeling
Natural Language Processing > Understanding > Lexical Semantics

Keywords

bayesian learning, topic modeling, hierarchical dirichlet process, distributional semantics, topic model, distributed representation, word representation, lexical substitution, topic-sensitive embedding, multi-sense learning
