Perceive the Passage of Time: A Systematic Evaluation of Large Language Model in Temporal Relativity

Shuang Chen; Yining Zheng; Shimin Li; Qinyuan Cheng; Xipeng Qiu

COLING 2025

Abstract

Temporal perception is crucial for Large Language Models (LLMs) to understand the world effectively. However, current benchmarks focus primarily on temporal reasoning and fall short in assessing the temporal characteristics that involve temporal perception, particularly temporal relativity. In this paper, we introduce TempBench, a comprehensive benchmark designed to evaluate the temporal-relative ability of LLMs. TempBench encompasses four distinct scenarios: Physiology, Psychology, Cognition, and Mixture. We conduct extensive experiments on GPT-4, a series of Llama models, and other popular LLMs. The experimental results demonstrate a significant performance gap between LLMs and humans in temporal-relative capability. Furthermore, we propose a set of error types for the temporal-relative ability of LLMs in order to analyze the impact of multiple aspects thoroughly and to highlight the associated challenges. We anticipate that TempBench will drive further advancements in enhancing the temporal-perceiving capabilities of LLMs.

Authors

Shuang Chen, Yining Zheng, Shimin Li, Qinyuan Cheng, Xipeng Qiu

Topics

Natural Language Processing > Resources & Methods > Large Language Models

Keywords

temporal reasoning, temporal perception, LLM benchmark, temporal relativity
