Simple Yet Effective: An Information-Theoretic Approach to Multi-LLM Uncertainty Quantification

Maya Kruse; Majid Afshar; Saksham Khatwani; Anoop Mayampurath; Guanhua Chen; Yanjun Gao

EMNLP 2025

Abstract

Large language models (LLMs) often behave inconsistently across inputs, indicating uncertainty and motivating the need for its quantification in high-stakes settings. Prior work on calibration and uncertainty quantification often focuses on individual models, overlooking the potential of model diversity. We hypothesize that LLMs make complementary predictions due to differences in training and the Zipfian nature of language, and that aggregating their outputs leads to more reliable uncertainty estimates. To leverage this, we propose MUSE (Multi-LLM Uncertainty via Subset Ensembles), a simple information-theoretic method that uses Jensen-Shannon Divergence to identify and aggregate well-calibrated subsets of LLMs. Experiments on binary prediction tasks demonstrate improved calibration and predictive performance compared to single-model and naive ensemble baselines. In addition, we explore using MUSE as guided signals with chain-of-thought distillation to fine-tune LLMs for calibration. MUSE is available at: https://github.com/LARK-NLP-Lab/MUSE.
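To make the idea of a Jensen-Shannon-based subset ensemble concrete, the following is a minimal Python sketch for a single binary prediction. It is not the authors' implementation: the abstract does not specify the selection rule, so the greedy procedure, the `max_jsd` threshold, and all function and model names here are hypothetical choices made for illustration. The only standard ingredient is the equal-weight Jensen-Shannon divergence among distributions, computed as the entropy of the mean distribution minus the mean of the individual entropies.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def generalized_jsd(probs: list) -> float:
    """Equal-weight JSD among Bernoulli distributions: H(mean) - mean(H)."""
    mean_p = sum(probs) / len(probs)
    mean_entropy = sum(binary_entropy(p) for p in probs) / len(probs)
    return binary_entropy(mean_p) - mean_entropy


def select_subset(probs: dict, max_jsd: float = 0.05) -> list:
    """Hypothetical greedy rule (an assumption, not the paper's algorithm):
    start from the most confident model, then add each further model only
    if the JSD of the enlarged subset stays at or below max_jsd."""
    ranked = sorted(probs, key=lambda name: abs(probs[name] - 0.5), reverse=True)
    chosen = [ranked[0]]
    for name in ranked[1:]:
        candidate = chosen + [name]
        if generalized_jsd([probs[m] for m in candidate]) <= max_jsd:
            chosen = candidate
    return chosen


def aggregate(probs: dict, subset: list) -> float:
    """Average the positive-class probabilities of the selected models."""
    return sum(probs[m] for m in subset) / len(subset)


# Illustrative positive-class probabilities from three hypothetical models.
example = {"model_a": 0.90, "model_b": 0.85, "model_c": 0.20}
subset = select_subset(example)
print(subset, aggregate(example, subset))
```

With these illustrative inputs, the two models that agree are kept and the disagreeing one is left out, so the aggregated probability is about 0.875. A low divergence within the subset signals agreement among its members, while the divergence of the full set of models can itself serve as a disagreement-based uncertainty signal.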

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Statistical Learning
Natural Language Processing > Resources & Methods > Large Language Models
Machine Learning > Optimization & Theory > Information Theory
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Ensemble Learning
Deep Learning > Models > Large Language Models
Machine Learning > Learning Types > Uncertainty Quantification

Keywords

ensemble learning; uncertainty quantification; ensemble method; language model; Jensen-Shannon divergence; large language model; chain-of-thought distillation; subset ensemble; multi-LLM ensemble
