DimA: A Parameter-efficient Fine-tuning Method with Knowledge Transfer Based on Transformer

Wenxuan Zhang; Min Huang; Zhuoyang Song; Qinghai Miao

2024 COLING COLING 2024

DimA: A Parameter-efficient Fine-tuning Method with Knowledge Transfer Based on Transformer

Abstract

AbstractFine-tuning is a widely used technique for leveraging pre-trained language models (PLMs) in downstream tasks, but it can be computationally expensive and storage-intensive. To address this challenge, researchers have developed parameter-efficient methods that balance performance and resource cost. However, these methods often come with trade-offs like increased inference latency, token length usage, or limited adaptability for multitasking scenarios. This paper introduces a novel parameter-efficient method called DimA(Dimensionality Augmentation), which enhances the Transformer architecture by increasing the dimensionality. DimA achieves state-of-the-art results in GLUE and XSUM tasks while utilizing less than 1% of the original model’s parameters. Moreover, DimA introduces a novel approach to knowledge transfer that enables the simultaneous utilization of knowledge learned from multiple tasks to handle new tasks. This method significantly enhances the performance of the model on new tasks. Its versatility in model structure also enables its application to various Transformer-based models.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio