PANDA: Empowering Small Language Models for Proactive Dialogue Through Agent-Based Synthesis (Student Abstract)

Rongyu Zhang; Dingyuan Zhang; Haopeng Li

AAAI 2026

Abstract

Proactive dialogue systems are designed to guide conversations toward predetermined goals. However, contemporary LLMs predominantly function as passive assistants, mechanically executing human instructions. A key challenge contributing to this limitation is the inherent difficulty of acquiring and annotating high-quality training data for proactive dialogue. Consequently, the scarcity of such data results in a notable deficiency in the proactive conversational capabilities of current LLMs. In this paper, we introduce PANDA (Proactive Agent-based Negotiation Dialogue Augmentation), a method designed to generate accurate, complex, and diverse proactive dialogue data for a challenging task, financial dispute mediation, in which an LLM acts as the mediator. PANDA leverages a novel self-evolving synthesis process to manage a pool of user profiles and generate dialogues through structured interactions between multiple LLM-driven agents. To ensure data fidelity, we propose a comprehensive evaluation framework and build a two-level validation system combining automated and expert human verification. Our experiments demonstrate that an 8B-parameter model trained on our synthesized dataset achieves state-of-the-art results under the task's evaluation framework. Even when given only essential information, it rivals top closed-source models guided by heavily engineered prompts.
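The abstract does not describe how PANDA is implemented, so the following is only a minimal sketch of the general pattern it names: a profile pool that grows over time, dialogues produced by alternating agents, and an automated check that precedes human review. Every name here (`Profile`, `synthesize_dialogue`, `auto_validate`, `evolve_pool`, `run`) is hypothetical, the agents are rule-based stubs standing in for LLM calls, and the mutation rule is an assumption rather than the authors' method.

```python
import random
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical user profile; the fields are illustrative assumptions."""
    name: str
    debt: int          # disputed amount, arbitrary units
    stubbornness: int  # number of offers rejected before the user settles


def synthesize_dialogue(profile, max_turns=6):
    """Alternate a stub mediator and a stub user; return the transcript."""
    turns = []
    offer = profile.debt
    for step in range(max_turns):
        # Stub mediator: proactively proposes a settlement 10% lower each time.
        offer = offer * 9 // 10
        turns.append(("mediator", f"Would you accept a settlement of {offer}?"))
        # Stub user: accepts once the rejection budget is used up.
        if step >= profile.stubbornness:
            turns.append(("user", "I accept."))
            return turns
        turns.append(("user", "No, that is still too high."))
    return turns


def auto_validate(turns):
    """First-level automated check (assumed rule): mediation must end in agreement."""
    return bool(turns) and turns[-1] == ("user", "I accept.")


def evolve_pool(pool, rng):
    """Stand-in for self-evolution: derive a new profile from an existing one."""
    parent = rng.choice(pool)
    child = Profile(
        name=f"{parent.name}-v{len(pool)}",
        debt=parent.debt + rng.randint(100, 500),
        stubbornness=rng.randint(0, 8),
    )
    return pool + [child]


def run(rounds=5, seed=0):
    """Grow the pool, synthesize one dialogue per new profile, keep valid ones."""
    rng = random.Random(seed)
    pool = [Profile("seed", 1000, 1)]
    accepted = []
    for _ in range(rounds):
        pool = evolve_pool(pool, rng)
        turns = synthesize_dialogue(pool[-1])
        if auto_validate(turns):
            # Second level (expert human verification) is only flagged here.
            accepted.append({"profile": pool[-1].name, "turns": turns,
                             "needs_human_review": True})
    return pool, accepted


if __name__ == "__main__":
    pool, accepted = run()
    print(f"{len(pool)} profiles, {len(accepted)} dialogues passed the automated check")
```

In a real pipeline each stub would be replaced by a prompted model call, and the flagged dialogues would be passed to human experts before entering the training set.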

Authors

Rongyu Zhang, Dingyuan Zhang, Haopeng Li

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Natural Language Processing > Generation > Dialogue Systems

Keywords

synthetic data generation, proactive dialogue, large language model, multi-agent system, dialogue augmentation, financial dispute mediation
