Extracting Linguistic Information from Large Language Models: Syntactic Relations and Derivational Knowledge

Tsedeniya Kinfe Temesgen; Marion Di Marco; Alexander Fraser

2025 EMNLP EMNLP 2025

Extracting Linguistic Information from Large Language Models: Syntactic Relations and Derivational Knowledge

Abstract

AbstractThis paper presents a study of the linguistic knowledge and generalization capabilities of Large Language Models (LLMs), focusing ontheir morphosyntactic competence. We design three diagnostic tasks: (i) labeling syntactic information at the sentence level - identifying subjects, objects, and indirect objects; (ii) derivational decomposition at the word level - identifying morpheme boundaries and labeling thedecomposed sequence; and (iii) in-depth study of morphological decomposition in German and Amharic. We evaluate prompting strategies in GPT-4o and LLaMA 3.3-70B to extract different types of linguistic structure for typologically diverse languages. Our results showthat GPT-4o consistently outperforms LLaMA in all tasks; however, both models exhibit limitations and show little evidence of abstract morphological rule learning. Importantly, we show strong evidence that the models fail to learn underlying morphological structures. Therefore,raising important doubts about their ability to generalize.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Interdisciplinary and Natural Language Processing

🧭 Keyword Pioneer — derivational knowledge

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Tsedeniya Kinfe Temesgen , Marion Di Marco , Alexander Fraser

Topics

Artificial Intelligence > Core AI > Interpretability Natural Language Processing > Understanding > Syntax Natural Language Processing > Resources & Methods > Large Language Models Interdisciplinary > Linguistics > Computational Linguistics Artificial Intelligence > Core AI > Large Language Models Natural Language Processing > Understanding > Morphology

Keywords

syntactic parsing syntactic analysis linguistic knowledge morphological analysis derivational morphology syntactic relation large language model morphological decomposition derivational knowledge morphosyntactic competence morpheme boundary

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025