Quantifying Logical Consistency in Transformers via Query-Key Alignment

Eduard Tulchinskii; Laida Kushnareva; Anastasia Voznyuk; Andrei Andriiainen; Irina Piontkovskaya; Evgeny Burnaev; Serguei Barannikov

2025 EMNLP EMNLP 2025

Quantifying Logical Consistency in Transformers via Query-Key Alignment

Abstract

AbstractLarge language models (LLMs) excel at many NLP tasks, yet their multi-step logical reasoning remains unreliable. Existing solutions such as Chain-of-Thought prompting generate intermediate steps but provide no internal check of their logical coherence. In this paper, we use the “QK-score”, a lightweight metric based on query–key alignments within transformer attention heads, to evaluate the logical reasoning capabilities of LLMs. Our method automatically identifies attention heads that play a key role in distinguishing valid from invalid logical inferences, enabling efficient inference-time evaluation via a single forward pass. It reveals latent reasoning structure in LLMs and provides a scalable mechanistic alternative to ablation-based analysis. Across three benchmarks: ProntoQA-OOD, PARARULE-Plus, and MultiLogicEval, with models ranging from 1.5B to 70B parameters, the selected heads predict logical validity up to 14% better than the model probabilities, and remain robust under distractors and increasing reasoning depth of d≤ 6.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning

🧭 Keyword Pioneer — query-key alignment

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio

Authors

Eduard Tulchinskii , Laida Kushnareva , Anastasia Voznyuk , Andrei Andriiainen , Irina Piontkovskaya , Evgeny Burnaev , Serguei Barannikov

Topics

Artificial Intelligence > Core AI > Interpretability Deep Learning > Architectures > Transformers Artificial Intelligence > Core AI > Large Language Models Artificial Intelligence > Core AI > Reasoning Deep Learning > Techniques > Attention Mechanism

Keywords

logical reasoning attention head logical consistency transformer interpretability transformer attention mechanistic analysis query-key alignment inference-time evaluation

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025