Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Peiyu Liu; Ze-Feng Gao; Wayne Xin Zhao; Zhi-Yuan Xie; Zhong-Yi Lu; Ji-Rong Wen

IJCNLP 2021

Abstract

This paper presents a novel compression approach for pre-trained language models (PLMs) based on the matrix product operator (MPO) from quantum many-body physics. An MPO decomposes an original matrix into central tensors (containing the core information) and auxiliary tensors (holding only a small proportion of the parameters). Building on this decomposed structure, we propose a novel fine-tuning strategy that updates only the parameters of the auxiliary tensors, and we design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach applies in a general way to both original and already compressed PLMs, yielding a lighter network and significantly reducing the number of parameters to be fine-tuned. Extensive experiments demonstrate the effectiveness of the proposed approach in model compression, especially in reducing the number of fine-tuned parameters (91% reduction on average). The code to reproduce the results of this paper can be found at https://github.com/RUCAIBox/MPOP.
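To make the decomposition concrete, the following is a minimal NumPy sketch of a generic MPO (tensor-train style) factorization of a weight matrix by sequential SVD, without truncation. It is not the authors' implementation (see the repository above for that). The function names `mpo_decompose` and `mpo_reconstruct`, the 64 x 64 random matrix, and the factorization (4, 4, 4) are hypothetical choices made for illustration. Treating the largest tensor as the central one is likewise an assumption of this sketch.

```python
import numpy as np

def mpo_decompose(W, in_factors, out_factors):
    """Split W (I x J) into n local 4-way tensors of shape (bond, i_k, j_k, bond').

    Illustrative sketch: exact sequential SVD, no bond truncation.
    """
    n = len(in_factors)
    assert len(out_factors) == n
    assert W.shape == (int(np.prod(in_factors)), int(np.prod(out_factors)))
    # Reshape to (i1..in, j1..jn), then interleave to (i1, j1, ..., in, jn).
    T = W.reshape(*in_factors, *out_factors)
    T = T.transpose([ax for k in range(n) for ax in (k, n + k)])
    tensors, bond, rest = [], 1, T
    for k in range(n - 1):
        # Group (left bond, i_k, j_k) as rows and everything else as columns.
        rest = rest.reshape(bond * in_factors[k] * out_factors[k], -1)
        U, S, Vt = np.linalg.svd(rest, full_matrices=False)
        tensors.append(U.reshape(bond, in_factors[k], out_factors[k], len(S)))
        rest = S[:, None] * Vt  # carry the singular values to the right
        bond = len(S)
    tensors.append(rest.reshape(bond, in_factors[-1], out_factors[-1], 1))
    return tensors

def mpo_reconstruct(tensors, in_factors, out_factors):
    """Contract the local tensors back into the full I x J matrix."""
    n = len(tensors)
    result = tensors[0]
    for t in tensors[1:]:
        result = np.tensordot(result, t, axes=([-1], [0]))
    # Drop the two dummy bonds, then undo the interleaving of i and j axes.
    result = result.reshape([d for pair in zip(in_factors, out_factors) for d in pair])
    result = result.transpose(list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2)))
    return result.reshape(int(np.prod(in_factors)), int(np.prod(out_factors)))

# Hypothetical example: a random 64 x 64 "weight matrix".
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
factors = (4, 4, 4)
tensors = mpo_decompose(W, factors, factors)

sizes = [t.size for t in tensors]
central = int(np.argmax(sizes))  # assumption: the largest tensor is the central one
auxiliary_params = sum(sizes) - sizes[central]

print([t.shape for t in tensors])
print("max reconstruction error:", np.abs(mpo_reconstruct(tensors, factors, factors) - W).max())
print("auxiliary share of MPO parameters:", auxiliary_params / sum(sizes))
```

In this toy setting the middle tensor holds most of the parameters and the two outer tensors hold the remainder, which illustrates why updating only the auxiliary tensors touches a small fraction of the weights. Without truncation the MPO form is exact but not smaller than the original matrix; compression would additionally require truncating the bond dimensions, which this sketch does not do.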
