Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs

Ayush Gupta; Ramneet Kaur; Anirban Roy; Adam D. Cobb; Rama Chellappa; Susmit Jha

2025 EMNLP EMNLP 2025

Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs

Abstract

AbstractWe propose a novel inference-time out-of-domain (OOD) detection algorithm for specialized large language models (LLMs). Despite achieving state-of-the-art performance on in-domain tasks through fine-tuning, specialized LLMs remain vulnerable to incorrect or unreliable outputs when presented with OOD inputs, posing risks in critical applications. Our method leverages the Inductive Conformal Anomaly Detection (ICAD) framework, using a new non-conformity measure based on the model’s dropout tolerance. Motivated by recent findings on polysemanticity and redundancy in LLMs, we hypothesize that in-domain inputs exhibit higher dropout tolerance than OOD inputs. We aggregate dropout tolerance across multiple layers via a valid ensemble approach, improving detection while maintaining theoretical false alarm bounds from ICAD. Experiments with medical-specialized LLMs show that our approach detects OOD inputs better than baseline methods, with AUROC improvements of 2% to 37% when treating OOD datapoints as positives and in-domain test datapoints as negatives.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ayush Gupta , Ramneet Kaur , Anirban Roy , Adam D. Cobb , Rama Chellappa , Susmit Jha

Topics

Artificial Intelligence > Core AI > Interpretability Machine Learning > Optimization & Theory > Statistical Learning Machine Learning > Learning Types > Representation Learning Artificial Intelligence > Core AI > Large Language Models Deep Learning > Models > Large Language Models Machine Learning > Learning Types > Uncertainty Quantification Machine Learning > Learning Types > Out-of-Distribution Detection

Keywords

conformal prediction anomaly detection out-of-domain detection large language model specialized language model

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025