AAAI 2026

SCoUT: A Framework for Structured Stereotype Analysis in Language Models

Abstract

Existing stereotype auditing methods for Large Language Models (LLMs) typically rely on isolated rating schemes or task-specific probes, lacking theoretical grounding and failing to reveal internal organization beyond surface-level output patterns. In this paper, we introduce SCoUT (Stereotype Content-oriented Utility structure via Thurstonian modeling), a closed-loop framework that structurally models, explicitly probes, and functionally steers stereotype dimensions (warmth and competence) in LLMs. SCoUT first reconstructs a global stereotype utility structure aligned with Stereotype Content Model theory via Thurstonian comparative judgments. Across multiple open-source LLMs, this modeling achieves high pairwise-preference prediction accuracy (≥ 0.90 on larger-scale models) and exhibits strong cross-model consistency. Probing internal attention mechanisms localizes this structure to specific heads (Spearman’s ρ up to 0.83 for warmth and 0.90 for competence) and surfaces a salient asymmetry between warmth and competence. Further, targeted inference-time activation modifications on these dimension-sensitive heads consistently steer model outputs along the intended axes. By bridging behavioral measurement with internal representation and controllable steering, SCoUT offers an end-to-end framework that uncovers and interprets the latent structure of stereotypes, advancing stereotype auditing from surface detection to structural analysis.
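To make the Thurstonian step concrete, the following is a minimal sketch of the classical Thurstone Case V estimate, in which each item's utility is the mean probit of its pairwise win rates. It is an illustration under stated assumptions, not the fitting procedure used in the paper, which the abstract does not specify. The function names, the clipping constant `eps`, and the toy win counts for three unnamed groups are all hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(wins, eps=0.01):
    """Estimate Case V utilities from pairwise win counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Assumption: equal, uncorrelated noise for every item (Case V), so
    the utility gap between i and j is proportional to probit(p_ij).
    """
    n = len(wins)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    utilities = []
    for i in range(n):
        z_scores = []
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            if total == 0:
                continue  # this pair was never compared
            p = wins[i][j] / total
            # Clip so the inverse CDF stays finite for unanimous pairs.
            p = min(max(p, eps), 1.0 - eps)
            z_scores.append(std_normal.inv_cdf(p))
        utilities.append(sum(z_scores) / len(z_scores) if z_scores else 0.0)
    return utilities


def pairwise_accuracy(wins, utilities):
    """Fraction of non-tied pairs whose majority preference the utilities predict."""
    correct = 0
    total = 0
    n = len(wins)
    for i in range(n):
        for j in range(i + 1, n):
            if wins[i][j] == wins[j][i]:
                continue  # no majority preference to predict
            observed_i_wins = wins[i][j] > wins[j][i]
            predicted_i_wins = utilities[i] > utilities[j]
            correct += observed_i_wins == predicted_i_wins
            total += 1
    return correct / total if total else 0.0


# Hypothetical counts: how often the row group was judged "warmer"
# than the column group across 10 comparisons per pair.
toy_wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
toy_utilities = thurstone_case_v(toy_wins)
toy_accuracy = pairwise_accuracy(toy_wins, toy_utilities)
```

On this toy matrix the recovered utilities rank the three groups in the same order as their win rates, so every majority preference is predicted correctly. The row-mean estimate is exact only when every pair has been compared; with sparse comparisons a maximum-likelihood fit would be the more usual choice.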
