EMNLP 2025

FQ-Eval: Building Evaluation Dataset for User-centered Follow-up Question Generation

Abstract

Providing user-centered follow-up questions is essential for helping users of chat-LLM services achieve their goals. Existing studies primarily focus on enhancing information-seeking or topical relevance, often overlooking how follow-up questions could satisfy users’ intrinsic needs and conversational goals. To bridge this gap, we introduce FQ-Eval, a user-centered evaluation dataset designed for assessing follow-up question generation in chat-LLM services. FQ-Eval incorporates realistic chat-LLM usage scenarios and five distinct human-aligned criteria, each reflecting user expectations of effective follow-up questions. Experimental results show that FQ-Eval, constructed through our approach, clearly captures these human-aligned criteria, enabling robust evaluation of follow-up question generation across various models and services.
