Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description

Mahshid Dehghani; Amirahmad Shafiee; Ali Shafiei; Neda Fallah; Farahmand Alizadeh; Mohammad Mehdi Gholinejad; Hamid Behroozi; Jafar Habibi; Ehsaneddin Asgari

NAACL 2025

Abstract

3D facial emotion modeling has important applications in areas such as animation design, virtual reality, and emotional human-computer interaction (HCI). However, existing models are constrained by limited emotion classes and insufficient datasets. To address this, we introduce Emo3D, an extensive “Text-Image-Expression dataset” that spans a wide spectrum of human emotions, each paired with images and 3D blendshapes. Leveraging Large Language Models (LLMs), we generate a diverse array of textual descriptions, enabling the capture of a broad range of emotional expressions. Using this dataset, we perform a comprehensive evaluation of fine-tuned language-based models and vision-language models, such as Contrastive Language-Image Pretraining (CLIP), for 3D facial expression synthesis. To better assess conveyed emotions, we introduce the Emo3D metric, a new evaluation metric that aligns more closely with human perception than the traditional Mean Squared Error (MSE). Unlike MSE, which measures only numerical differences, the Emo3D metric captures emotional nuances in visual-text alignment and semantic richness. The Emo3D dataset and metric hold great potential for advancing applications in animation and virtual reality.
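The abstract's point about MSE can be illustrated with a small sketch. The code below is not the Emo3D metric; it only computes plain MSE between blendshape coefficient vectors. The four coefficient names and all values are hypothetical, chosen for illustration. It shows two predictions that deviate from the same target on different coefficients, and so would likely read as different expressions, yet receive an identical MSE.

```python
from typing import Sequence


def blendshape_mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two blendshape coefficient vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Hypothetical coefficient layout (an assumption, not from the paper):
# [jaw_open, smile_left, smile_right, brow_down]
target = [0.0, 0.5, 0.5, 0.0]        # a symmetric smile
weaker_smile = [0.0, 0.5, 0.25, 0.0]  # one mouth corner raised less
lowered_brow = [0.0, 0.5, 0.5, 0.25]  # same smile, brow pulled down

mse_a = blendshape_mse(weaker_smile, target)
mse_b = blendshape_mse(lowered_brow, target)

# Both errors are equal even though the deviations affect different
# facial regions; MSE alone cannot tell the two cases apart.
print(mse_a, mse_b)
```

A perception-aligned metric, as the abstract describes, is meant to separate cases like these, which a purely numerical distance treats as equivalent.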

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Natural Language Processing > Generation > Text Generation

Keywords

emotion recognition, text-to-image generation, vision-language model, CLIP model, 3D facial expression
