Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground

Adil Soubki; John Murzaku; Arash Yousefi Jordehi; Peter Zeng; Magdalena Markowska; Seyed Abolghasem Mirroshandel; Owen Rambow

ACL 2024

Abstract

Evaluating the theory of mind (ToM) capabilities of language models (LMs) has recently received a great deal of attention. However, many existing benchmarks rely on synthetic data, which risks misaligning the resulting experiments with human behavior. We introduce the first ToM dataset based on naturally occurring spoken dialogs, Common-ToM, and show that LMs struggle to demonstrate ToM. We then show that integrating a simple, explicit representation of beliefs improves LM performance on Common-ToM.

Topics

Natural Language Processing > Resources & Methods > Large Language Models

Keywords

language model, theory of mind, common ground, dialogue understanding, belief representation
