EMNLP 2025

Towards Author-informed NLP: Mind the Social Bias

Abstract

Social text understanding is prone to fail when opinions are conveyed implicitly or sarcastically. It is therefore desirable to model users’ context when processing the texts they author. In this work, we represent users within a social embedding space that was learned at large scale from the Twitter network. Similar to word embeddings that encode lexical semantics, the network embeddings encode latent dimensions of social semantics. We perform extensive experiments on author-informed stance prediction, demonstrating improved generalization through inductive social user modeling, both within and across topics. Similar results were obtained for author-informed toxicity and incivility detection. The proposed approach may pave the way for social NLP that treats user embeddings as a contextual modality. However, our investigation also reveals that user stances are correlated with the personal socio-demographic traits encoded in their embeddings. Hence, author-informed NLP approaches may inadvertently model and reinforce socio-demographic and other social biases.
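As a rough illustration of what treating a user embedding as a contextual modality could look like, the minimal sketch below concatenates a text representation with the author's social embedding and scores the result with a logistic function. It is not the paper's model: the function names `fuse_features` and `stance_probability`, the vector sizes, and all numeric values are hypothetical and chosen only to make the idea concrete.

```python
import math
from typing import List, Sequence


def fuse_features(text_vec: Sequence[float], user_vec: Sequence[float]) -> List[float]:
    """Concatenate a text representation with the author's social embedding."""
    return list(text_vec) + list(user_vec)


def stance_probability(
    features: Sequence[float], weights: Sequence[float], bias: float = 0.0
) -> float:
    """Logistic score for a binary stance label (for example, pro vs. con)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


# Toy, made-up values purely for illustration (assumptions, not data from the paper).
text_vec = [0.2, -0.1, 0.4]   # hypothetical text encoder output
user_vec = [0.7, -0.3]        # hypothetical social (network) embedding of the author
weights = [0.5, 0.5, 0.5, 1.0, 1.0]

fused = fuse_features(text_vec, user_vec)
prob = stance_probability(fused, weights)
print(len(fused), round(prob, 3))
```

Because the author's embedding enters the classifier directly, any socio-demographic signal it carries can influence the prediction, which is the bias concern the abstract raises.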
