Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models

Yin Jou Huang; Rafik Hadfi

EMNLP 2025

Abstract

Self-report questionnaires have long been used to assess LLM personality traits, yet they fail to capture behavioral nuances due to biases and meta-knowledge contamination. This paper proposes a novel multi-observer framework for personality trait assessments in LLM agents that draws on informant-report methods in psychology. Instead of relying on self-assessments, we employ multiple observer LLM agents, each of which is configured with a specific relationship (e.g., family member, friend, or coworker). The observer agents interact with the subject LLM agent before assessing its Big Five personality traits. We show that observer-report ratings align more closely with human judgments than traditional self-reports and reveal systematic biases in LLM self-assessments. Further analysis shows that aggregating ratings of multiple observers provides more reliable results, reflecting a wisdom-of-the-crowd effect up to 5 to 7 observers.
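As a rough illustration of the aggregation step mentioned in the abstract, the sketch below averages per-trait ratings from several observers. It is not the authors' implementation: the abstract does not say how ratings are combined, so the simple mean, the 1-to-5 rating scale, the function name `aggregate_observer_ratings`, and the sample ratings are all assumptions made for this example.

```python
from statistics import mean

# The Big Five traits that each observer rates.
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def aggregate_observer_ratings(ratings):
    """Average each trait across observers (assumed aggregation rule)."""
    if not ratings:
        raise ValueError("need at least one observer rating")
    return {trait: mean(r[trait] for r in ratings) for trait in TRAITS}


# Hypothetical ratings on an assumed 1-5 scale, one dict per observer
# (e.g., family member, friend, coworker). Values are made up.
observer_ratings = [
    {"openness": 4.0, "conscientiousness": 3.0, "extraversion": 5.0,
     "agreeableness": 4.0, "neuroticism": 2.0},
    {"openness": 5.0, "conscientiousness": 3.0, "extraversion": 4.0,
     "agreeableness": 4.0, "neuroticism": 1.0},
    {"openness": 3.0, "conscientiousness": 3.0, "extraversion": 3.0,
     "agreeableness": 4.0, "neuroticism": 3.0},
]

aggregated = aggregate_observer_ratings(observer_ratings)
```

Under these assumptions, adding observers smooths out the idiosyncrasies of any single rater, which is the intuition behind the wisdom-of-the-crowd effect the abstract describes.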

Authors

Yin Jou Huang; Rafik Hadfi

Topics

Artificial Intelligence > Core AI > Human-AI Interaction
Artificial Intelligence > Core AI > Multi-Agent Systems
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

personality assessment, Big Five personality, large language model, multi-agent system, multi-observer framework, self-report bias
