TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning

Hang Ni; Fan LIU; Xinyu Ma; Lixin Su; Shuaiqiang Wang; Dawei Yin; Hui Xiong; Hao Liu

2025 EMNLP EMNLP 2025

TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning

Abstract

AbstractLarge language models (LLMs) have shown promise in automating travel planning, yet they often fall short in addressing nuanced spatiotemporal rationality. While existing benchmarks focus on basic plan validity, they neglect critical aspects such as route efficiency, POI appeal, and real-time adaptability. This paper introduces **TP-RAG**, the first benchmark tailored for retrieval-augmented, spatiotemporal-aware travel planning. Our dataset includes 2,348 real-world travel queries, 85,575 fine-grain annotated POIs, and 18,784 high-quality travel trajectory references sourced from online tourist documents, enabling dynamic and context-aware planning. Through extensive experiments, we reveal that integrating reference trajectories significantly improves spatial efficiency and POI rationality of the travel plan, while challenges persist in universality and robustness due to conflicting references and noisy data. To address these issues, we propose *EvoRAG*, an evolutionary framework that potently synergizes diverse retrieved trajectories with LLMs’ intrinsic reasoning. *EvoRAG* achieves state-of-the-art performance, improving spatiotemporal compliance and reducing commonsense violation compared to ground-up and retrieval-augmented baselines. Our work underscores the potential of hybridizing Web knowledge with LLM-driven optimization, paving the way for more reliable and adaptive travel planning agents.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Hang Ni , Fan LIU , Xinyu Ma , Lixin Su , Shuaiqiang Wang , Dawei Yin , Hui Xiong , Hao Liu

Topics

Artificial Intelligence > Core AI > Agent Systems Artificial Intelligence > Learning Paradigms > Transfer Learning Natural Language Processing > Applications > Question Answering Artificial Intelligence > Core AI > Large Language Models Machine Learning > Learning Types > Retrieval-Augmented Generation Natural Language Processing > Generation > Retrieval-Augmented Generation

Keywords

information retrieval benchmark dataset retrieval-augmented generation poi recommendation llm agent large language model travel planning spatiotemporal reasoning

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025