Exploring the Value of Personalized Word Embeddings

Charles Welch; Jonathan K. Kummerfeld; Veronica Perez-Rosas; Rada Mihalcea

COLING 2020

Abstract

In this paper, we introduce personalized word embeddings and examine their value for language modeling. We compare the performance of our proposed prediction model when using personalized versus generic word representations, and study how these representations can be leveraged for improved performance. We provide insight into what types of words can be more accurately predicted when building personalized models. Our results show that words belonging to specific psycholinguistic categories tend to vary more in their representations across users, and that combining generic and personalized word embeddings yields the best performance, with a 4.7% relative reduction in perplexity. Additionally, we show that a language model using personalized word embeddings can be effectively used for authorship attribution.
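
As a rough illustration of two quantities mentioned in the abstract, the sketch below computes a relative reduction in perplexity and attributes a text to the author whose language model gives it the lowest perplexity. It is not the paper's method: the abstract does not describe the model, so add-one-smoothed unigram models stand in for the personalized language models, and every name here (`train_unigram`, `perplexity`, `attribute`, `relative_reduction`, the toy corpora) is a hypothetical choice made for this example.

```python
import math
from collections import Counter

UNK = "<unk>"  # assumed placeholder for out-of-vocabulary tokens


def train_unigram(tokens, vocab):
    """Add-one-smoothed unigram model (a stand-in, not the paper's model)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(model, tokens):
    """exp of the average negative log-probability per token."""
    log_prob = sum(math.log(model.get(t, model[UNK])) for t in tokens)
    return math.exp(-log_prob / len(tokens))


def attribute(models, tokens):
    """Pick the author whose model assigns the text the lowest perplexity."""
    return min(models, key=lambda author: perplexity(models[author], tokens))


def relative_reduction(baseline_ppl, new_ppl):
    """Relative perplexity reduction, e.g. 0.047 for a 4.7% reduction."""
    return (baseline_ppl - new_ppl) / baseline_ppl


# Toy data, invented for illustration only.
corpora = {
    "alice": "i love my cat and my cat loves me".split(),
    "bob": "the server crashed and the logs show the error".split(),
}
vocab = sorted({w for toks in corpora.values() for w in toks} | {UNK})
models = {author: train_unigram(toks, vocab) for author, toks in corpora.items()}

sample = "my cat loves me".split()
predicted = attribute(models, sample)
print(predicted)
```

In this toy setup, each author's model plays the role of a personalized model, and attribution reduces to comparing perplexities across authors. A real system would replace `train_unigram` with whatever personalized language model is being evaluated.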

Topics

Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Resources & Methods > Text Representation

Keywords

language modeling, authorship attribution, perplexity reduction, word representation, personalized word embedding
