Prompt Learning with Extended Kalman Filter for Pre-trained Language Models

Quan Li; Xike Xie; Chao Wang; S. Kevin Zhou

2024 IJCAI IJCAI 2024

Prompt Learning with Extended Kalman Filter for Pre-trained Language Models

Abstract

Prompt learning has gained popularity as a means to leverage the knowledge embedded in pre-trained language models (PLMs) for NLP tasks while using a limited number of trainable parameters. While it has shown promise in tasks like sentiment classification and natural language inference, generating suitable prompts for PLMs, as opposed to human prompts, remains a challenge. In this paper, we introduce an abstraction of the prompt learning process using an extended Kalman filter. Our approach, called Conditional Extended Kalman Filter based on Neural Networks (CEKFNN), effectively infers more appropriate prompt tokens by enhancing the classic extended Kalman filter with PLM's contextual representation power. Specifically, CEKFNN learns transition and emission functions from PLM embeddings of input sentences to infer latent prompt tokens. We refine CEKFNN using an alternate-training approach, retraining a PLM's emission function with prompt tokens inferred by prompt models (PMs), as well as the initial and transition functions. PLM's output labels assist in PMs' training. When updating the pre-trained language model (PLM), we use an adapter approach with few trainable parameters, leaving PLM parameters frozen. We evaluate CEKFNN across open-source PLMs, demonstrating performance improvements over state-of-the-art methods while using a limited number of trainable parameters. It shows that CEKFNN performs on-par or better than fine-tuning, which requires updating all parameters in the PLM.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio

Authors

Quan Li , Xike Xie , Chao Wang , S. Kevin Zhou

Topics

Machine Learning > Optimization & Theory > Bayesian Inference Machine Learning > Application Areas > Efficient Computing Deep Learning > Techniques > Pretraining Natural Language Processing > Resources & Methods > Large Language Models Machine Learning > Bayesian & Probabilistic > Bayesian Inference Machine Learning > Learning Types > Prompt Engineering

Keywords

natural language inference prompt learning extended kalman filter latent variable parameter-efficient fine-tuning pre-trained language model

Download PDF

Related papers

Langshaw: Declarative Interaction Protocols Based on Sayso and Conflict 2024

A Successful Strategy for Multichannel Iterated Prisoner’s Dilemma 2024

Bring Metric Functions into Diffusion Models 2024

Fast One-Stage Unsupervised Domain Adaptive Person Search 2024

FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution 2024