Derivational Probing: Unveiling the Layer-wise Derivation of Syntactic Structures in Neural Language Models

Taiga Someya; Ryo Yoshida; Hitomi Yanaka; Yohei Oseki

CoNLL 2025

Abstract

Recent work has demonstrated that neural language models encode syntactic structures in their internal *representations*, yet the *derivations* by which these structures are constructed across layers remain poorly understood. In this paper, we propose *Derivational Probing* to investigate how micro-syntactic structures (e.g., subject noun phrases) and macro-syntactic structures (e.g., the relationship between the root verbs and their direct dependents) are constructed as word embeddings propagate upward across layers. Our experiments on BERT reveal a clear bottom-up derivation: micro-syntactic structures emerge in lower layers and are gradually integrated into a coherent macro-syntactic structure in higher layers. Furthermore, a targeted evaluation on subject-verb number agreement shows that the timing of constructing macro-syntactic structures is critical for downstream performance, suggesting an optimal timing for integrating global syntactic information.
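
To make the notion of layer-wise "timing" concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes that some probe has already produced one score per layer for each structure type, and it reports the first layer at which that score reaches a chosen fraction of its best value. The function name `emergence_layer`, the `fraction` threshold, and the toy scores are all hypothetical and chosen only for illustration; the numbers are synthetic and are not results from the paper.

```python
from typing import Dict, List


def emergence_layer(scores: List[float], fraction: float = 0.9) -> int:
    """Return the index of the first layer whose probe score reaches
    `fraction` of the best score observed across all layers.

    This is an illustrative heuristic (an assumption), not the paper's metric.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    threshold = fraction * max(scores)
    for layer, score in enumerate(scores):
        if score >= threshold:
            return layer
    # Fallback: only reached in edge cases such as all-negative scores.
    return len(scores) - 1


# Synthetic, made-up per-layer probe scores (NOT taken from the paper).
toy_scores: Dict[str, List[float]] = {
    "micro (subject NP)": [0.30, 0.62, 0.85, 0.90, 0.91, 0.91],
    "macro (root verb + dependents)": [0.10, 0.20, 0.35, 0.60, 0.82, 0.88],
}

# Map each structure type to the layer where it "emerges" under this heuristic.
timing = {name: emergence_layer(s) for name, s in toy_scores.items()}

for name, layer in timing.items():
    print(f"{name}: emerges at layer index {layer}")
```

With these toy values the micro-level structure crosses the threshold at an earlier layer index than the macro-level one, which is the kind of bottom-up ordering the abstract describes. A different threshold or a different summary statistic could shift the reported layers, so the choice of `fraction` is itself an assumption.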

Topics

Artificial Intelligence > Core AI > Interpretability; Machine Learning > Core Methods > Representation Learning

Keywords

layer-wise analysis, internal representation, syntactic structure, subject-verb agreement, probe classifier, derivational probing
