Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models

Anthony Sicilia; Hyunwoo Kim; Khyathi Chandu; Malihe Alikhani; Jack Hessel

2024 ACL ACL 2024

Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models

Abstract

AbstractEffective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a dialogue. How well can language models represent inherent uncertainty in conversations? We propose FortUne Dial, an expansion of the long-standing “conversation forecasting” task: instead of just accuracy, evaluation is conducted with uncertainty-aware metrics, effectively enabling abstention on individual instances. We study two ways in which language models potentially represent outcome uncertainty (internally, using scores and directly, using tokens) and propose fine-tuning strategies to improve calibration of both representations. Experiments on eight difficult negotiation corpora demonstrate that our proposed fine-tuning strategies (a traditional supervision strategy and an off-policy reinforcement learning strategy) can calibrate smaller open-source models to compete with pre-trained models 10x their size.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Anthony Sicilia , Hyunwoo Kim , Khyathi Chandu , Malihe Alikhani , Jack Hessel

Topics

Artificial Intelligence > Core AI > Foundation Models Artificial Intelligence > Core AI > Large Language Models Machine Learning > Learning Types > Uncertainty Quantification Artificial Intelligence > Core AI > Dialogue Systems

Keywords

uncertainty quantification dialogue system conversation forecasting large language model

Download PDF

Related papers

Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs 2024

EtymoLink: A Structured English Etymology Dataset 2024

Turkish Delights: A Dataset on Turkish Euphemisms 2024

Subjectivity Detection in English News using Large Language Models 2024

Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better 2024