CP-Router: An Uncertainty-Aware Router Between LLM and LRM

Jiayuan Su; Fulin Lin; Zhaopeng Feng; Han Zheng; Teng Wang; Zhenyu Xiao; Xinlong Zhao; Zuozhu Liu; Lu Cheng; Hongwei Wang

AAAI 2026


Abstract

Recent advances in large reasoning models (LRMs) have significantly enhanced long-chain reasoning capabilities over standard large language models (LLMs). However, LRMs often produce unnecessarily lengthy outputs even for simple queries, leading to inefficiencies or even accuracy degradation compared to LLMs. To address this, we propose CP-Router, a training-free, model-agnostic routing framework that dynamically selects between an LLM and an LRM, demonstrated with multiple-choice question answering (MCQA) prompts. The routing decision is guided by prediction uncertainty estimates derived via Conformal Prediction (CP), which provides rigorous coverage guarantees. To improve uncertainty differentiation across inputs, we introduce Full and Binary Entropy (FBE), a novel entropy-based criterion that adaptively selects the appropriate CP threshold. Experiments across MCQA and QA benchmarks (including mathematics, logical reasoning, and Chinese chemistry) demonstrate that CP-Router reduces token usage while maintaining or even improving accuracy compared to using the LRM alone. We further demonstrate the generality and robustness of CP-Router by extending it to diverse model pairings beyond the LLM–LRM setting.
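To make the routing idea concrete, here is a minimal Python sketch of conformal-prediction-based routing on MCQA option probabilities. It is an illustration under stated assumptions, not the authors' implementation: it assumes standard split conformal prediction with the nonconformity score 1 - p(true option), a fixed error level `alpha` (the FBE criterion that selects the threshold is not reproduced here, since the abstract does not define it), and a routing rule that keeps the LLM only when the prediction set contains exactly one option. The function names and the toy calibration data are hypothetical.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold for the score s = 1 - p(true option)."""
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every option.
        return 1.0
    return scores[k - 1]

def prediction_set(probs, q_hat):
    """Options whose nonconformity score does not exceed the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= q_hat]

def route(probs, q_hat):
    """Assumed rule: a singleton set means the LLM is confident enough."""
    return "LLM" if len(prediction_set(probs, q_hat)) == 1 else "LRM"

# Toy calibration set (hypothetical): 7 four-option questions, true option 0.
true_probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
cal_probs = [[p] + [(1.0 - p) / 3] * 3 for p in true_probs]
cal_labels = [0] * len(true_probs)

q_hat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

easy = [0.85, 0.05, 0.05, 0.05]   # one dominant option
hard = [0.45, 0.45, 0.05, 0.05]   # two options are plausible
print(route(easy, q_hat), route(hard, q_hat))
```

In this sketch a query whose prediction set is empty or has several options is sent to the LRM; other policies (for example, a size cutoff larger than one) would fit the same structure.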



Topics

Artificial Intelligence > Core AI > Foundation Models; Artificial Intelligence > Core AI > Model Compression; Machine Learning > Optimization & Theory > Bayesian Inference

Keywords

conformal prediction; uncertainty quantification; model routing; large reasoning model; large language model; token optimization
