Disentangling Emotion Understanding and Generation in Large Language Models

Sadegh Jafari; Els Lefever; Véronique Hoste

EACL 2026

Abstract

Large language models (LLMs) have demonstrated strong performance on emotion understanding tasks, yet their ability to faithfully generate emotionally aligned text remains less well understood. We propose a semantic evaluation framework that jointly assesses emotion understanding, emotion generation, and internal consistency, using a VAE-based emotion cost matrix that captures graded semantic similarity between emotion categories. Our framework introduces four complementary metrics that disentangle baseline understanding, human-perceived emotion in generated text, generation quality, and model consistency. Experimental results show that while understanding and consistency scores are highly correlated, emotion generation exhibits substantially weaker correlations with these metrics. These findings motivate the development of specialized evaluation protocols that independently measure emotional understanding and generation, enabling more reliable assessments of LLM emotional intelligence.
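
To make the idea of a graded emotion cost matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes each emotion category has a latent vector (in the paper these would come from a trained VAE, which is not reproduced here) and that the cost between two emotions is their normalized Euclidean distance. The coordinates in `LATENTS`, the distance choice, and the names `build_cost_matrix` and `semantic_score` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 2-D latent coordinates for four emotions (illustrative values
# only; a real setup would take these from a trained VAE encoder).
LATENTS = {
    "joy": (1.0, 0.0),
    "excitement": (0.8, 0.6),
    "sadness": (-1.0, 0.0),
    "anger": (0.0, -1.0),
}


def build_cost_matrix(latents):
    """Return pairwise Euclidean distances scaled to [0, 1].

    Assumption: distance in latent space stands in for semantic dissimilarity.
    """
    labels = sorted(latents)
    raw = {
        (a, b): math.dist(latents[a], latents[b])
        for a in labels
        for b in labels
    }
    max_dist = max(raw.values())
    return {pair: dist / max_dist for pair, dist in raw.items()}


def semantic_score(gold, predicted, cost):
    """Average of (1 - cost) over aligned gold and predicted labels.

    An exact match scores 1.0; a near miss (e.g. joy vs. excitement) is
    penalized less than a distant confusion (e.g. joy vs. sadness).
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal length")
    total = sum(1.0 - cost[(g, p)] for g, p in zip(gold, predicted))
    return total / len(gold)


COST = build_cost_matrix(LATENTS)
```

Under these assumed coordinates, `semantic_score(["joy"], ["excitement"], COST)` is higher than `semantic_score(["joy"], ["sadness"], COST)`, which is the graded behavior that exact-match accuracy cannot express.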

Topics

Natural Language Processing > Understanding > Sentiment Analysis
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

variational autoencoder, emotion classification, semantic evaluation, emotion understanding, large language model, emotion generation
