Universal Post-Processing Networks for Joint Optimization of Modules in Task-Oriented Dialogue Systems

Atsumoto Ohashi; Ryuichiro Higashinaka

AAAI 2025

Abstract

Post-processing networks (PPNs) are components that modify the outputs of arbitrary modules in task-oriented dialogue systems and are optimized using reinforcement learning (RL) to improve the overall task completion capability of the system. However, previous PPN-based approaches have handled only a subset of the modules within a system, which restricts how much they can improve system performance. In this study, we propose a joint optimization method for post-processing the outputs of all modules using universal post-processing networks (UniPPNs), which are language-model-based networks that modify the output of any module in a system by treating the modification as a sequence-transformation task. Moreover, our RL algorithm employs a module-level Markov decision process, which enables fine-grained value and advantage estimation for each module and thereby stabilizes joint learning across all modules. Through both simulation-based and human evaluation experiments using the MultiWOZ dataset, we demonstrate that UniPPN outperforms conventional PPNs in the task completion capability of task-oriented dialogue systems.
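As a rough illustration of what per-module value and advantage estimation could look like, the sketch below flattens a dialogue into one step per module per turn and applies standard generalized advantage estimation (GAE) over that sequence. It is a minimal sketch under stated assumptions, not the algorithm from the paper: the module names (`NLU`, `DST`, `Policy`, `NLG`), the choice to credit each turn's reward to the last module step of that turn, and the helper names `flatten_turns` and `module_level_gae` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def flatten_turns(
    turn_rewards: Sequence[float], modules: Sequence[str]
) -> Tuple[List[Tuple[int, str]], List[float]]:
    """Expand turn-level rewards into module-level steps.

    Assumption: the reward of a turn is credited to the last module
    step of that turn; every other module step receives 0.
    """
    steps: List[Tuple[int, str]] = []
    rewards: List[float] = []
    for turn_idx, reward in enumerate(turn_rewards):
        for i, name in enumerate(modules):
            steps.append((turn_idx, name))
            rewards.append(reward if i == len(modules) - 1 else 0.0)
    return steps, rewards


def module_level_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> Tuple[List[float], List[float]]:
    """Standard GAE computed over module-level steps.

    rewards[t] and values[t] belong to the t-th module step of one
    episode; the episode is assumed to terminate after the last step.
    Returns (advantages, value targets).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")
    advantages = [0.0] * len(rewards)
    next_value = 0.0  # value after the terminal step
    next_adv = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error for this module step.
        delta = rewards[t] + gamma * next_value - values[t]
        next_adv = delta + gamma * lam * next_adv
        advantages[t] = next_adv
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


if __name__ == "__main__":
    modules = ["NLU", "DST", "Policy", "NLG"]  # hypothetical pipeline
    steps, rewards = flatten_turns([-1.0, 5.0], modules)
    values = [0.0] * len(rewards)  # placeholder critic outputs
    advantages, _ = module_level_gae(rewards, values)
    for (turn, name), adv in zip(steps, advantages):
        print(turn, name, round(adv, 3))
```

Because each module step has its own value estimate, each post-processing decision gets its own advantage instead of sharing a single turn-level signal, which is the kind of granularity the abstract attributes to the module-level formulation.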

Topics

Artificial Intelligence > Core AI > Agent Systems
Machine Learning > Optimization & Theory > Optimization
Reinforcement Learning > Methods > Deep RL
Machine Learning > Learning Types > Reinforcement Learning
Natural Language Processing > Applications > Dialogue Systems
Artificial Intelligence > Core AI > Dialogue Systems

Keywords

reinforcement learning, task-oriented dialogue, language model, joint optimization, dialogue system, task completion, post-processing network, module-level Markov decision process
