Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization

Lei Lin; Shuangtao Li; Yafang Zheng; Biao Fu; Shan Liu; Yidong Chen; Xiaodong Shi

2023 EMNLP EMNLP 2023

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization

Abstract

AbstractRecent studies have shown that sequence-to-sequence (seq2seq) models struggle with compositional generalization (CG), i.e., the ability to systematically generalize to unseen compositions of seen components. There is mounting evidence that one of the reasons hindering CG is the representation of the encoder uppermost layer is entangled, i.e., the syntactic and semantic representations of sequences are entangled. However, we consider that the previously identified representation entanglement problem is not comprehensive enough. Additionally, we hypothesize that the source keys and values representations passing into different decoder layers are also entangled. Starting from this intuition, we propose CompoSition (Compose Syntactic and Semantic Representations), an extension to seq2seq models which learns to compose representations of different encoder layers dynamically for different tasks, since recent studies reveal that the bottom layers of the Transformer encoder contain more syntactic information and the top ones contain more semantic information. Specifically, we introduce a composed layer between the encoder and decoder to compose different encoder layers’ representations to generate specific keys and values passing into different decoder layers. CompoSition achieves competitive results on two comprehensive and realistic benchmarks, which empirically demonstrates the effectiveness of our proposal. Codes are available at https://github.com/thinkaboutzero/COMPOSITION.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — encoder representation composition

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Lei Lin , Shuangtao Li , Yafang Zheng , Biao Fu , Shan Liu , Yidong Chen , Xiaodong Shi

Topics

Machine Learning > Core Methods > Representation Learning Deep Learning > Architectures > Transformers Natural Language Processing > Generation > Language Modeling Natural Language Processing > Generation > Text Generation

Keywords

representation learning semantic representation compositional generalization transformer encoder sequence-to-sequence model syntactic representation encoder representation composition transformer encoder layer syntax-semantic disentanglement

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023