Linear Decoding of Morphology Relations in Language Models (Student Abstract)

Eric Xia; Jugal Kalita

AAAI 2025

Abstract

The recent success of transformer language models owes much to their conversational fluency, which includes linguistic and morphological proficiency. An affine Taylor approximation has been found to approximate transformer computations well over certain factual and encyclopedic relations. We show that the truly linear approximation W s, where s is an early-layer representation of the base form and W is a local model derivative, is necessary and sufficient to approximate morphological derivation, achieving above 80% top-1 accuracy across most morphological tasks in the Bigger Analogy Test Set. We argue that many morphological forms in transformer models are likely linearly encoded.
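To make the distinction between the affine and the truly linear approximation concrete, the following is a minimal sketch on a toy function, not the authors' implementation. It assumes the affine form is the first-order Taylor expansion F(s0) + W (s - s0) = W s + b, with W the Jacobian of F at a reference point s0, and that the linear form keeps only W s. The function `toy_model`, the finite-difference Jacobian, and all helper names are hypothetical stand-ins; the abstract does not specify how W is computed or how it is used for decoding.

```python
import math


def toy_model(s):
    """Hypothetical stand-in for the map from an early-layer representation s
    to a later-layer output. This is NOT a transformer, only a small
    nonlinear function used to illustrate the approximation."""
    x, y = s
    return [math.tanh(2.0 * x + 0.5 * y), math.tanh(-x + 1.5 * y)]


def jacobian(f, s0, eps=1e-6):
    """Central finite-difference estimate of W = dF/ds evaluated at s0.
    (Assumption: a real model would use automatic differentiation.)"""
    out_dim = len(f(s0))
    W = [[0.0] * len(s0) for _ in range(out_dim)]
    for j in range(len(s0)):
        plus = list(s0)
        minus = list(s0)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = f(plus), f(minus)
        for i in range(out_dim):
            W[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return W


def matvec(W, s):
    """Matrix-vector product W s for nested-list matrices."""
    return [sum(w * x for w, x in zip(row, s)) for row in W]


def affine_approx(f, s0, s):
    """Affine (first-order Taylor) approximation: F(s0) + W (s - s0)."""
    W = jacobian(f, s0)
    f0 = f(s0)
    delta = matvec(W, [a - b for a, b in zip(s, s0)])
    return [a + d for a, d in zip(f0, delta)]


def linear_approx(f, s0, s):
    """Bias-free linear approximation: W s, dropping the offset term."""
    return matvec(jacobian(f, s0), s)
```

Calling `affine_approx(toy_model, s0, s)` and `linear_approx(toy_model, s0, s)` for the same inputs shows the only difference between the two forms: the constant offset b = F(s0) - W s0, which the linear variant omits.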

Authors

Eric Xia, Jugal Kalita

Topics

Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Resources & Methods > Large Language Models
Interdisciplinary > Linguistics > Morphology
Natural Language Processing > Resources & Methods > Language Modeling
Deep Learning > Models > Transformers

Keywords

representation learning, linear approximation, language model, linear probing, affine Taylor approximation
