Prototypical Reward Network for Data-Efficient RLHF

Jinghan Zhang; Xiting Wang; Yiqiao Jin; Changyu CHEN; Xinhao Zhang; Kunpeng Liu

ACL 2024

Abstract

The reward model for Reinforcement Learning from Human Feedback (RLHF) has proven effective in fine-tuning Large Language Models (LLMs). However, collecting human feedback for RLHF can be resource-intensive and can lead to scalability issues for LLMs and complex tasks. Our proposed framework, Proto-RM, leverages prototypical networks to enhance reward models under limited human feedback. By enabling stable and reliable structural learning from fewer samples, Proto-RM enhances LLMs' adaptability and accuracy in interpreting human preferences. Extensive experiments on various datasets demonstrate that Proto-RM significantly improves the performance of reward models and LLMs in human feedback tasks, achieving results comparable to, and usually better than, those of traditional methods while requiring significantly less data in data-limited scenarios. This research offers a promising direction for enhancing the efficiency of reward models and optimizing the fine-tuning of language models under restricted feedback conditions.
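The abstract does not spell out how Proto-RM combines prototypes with a reward model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard prototypical-network definition of a prototype (the mean embedding of a class) and the usual pairwise RLHF preference loss, -log sigmoid(r_chosen - r_rejected). The toy 2-D embeddings, the "good"/"bad" labels, the `proto_values` mapping, and every function name are hypothetical and chosen purely for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_prototypes(embeddings, labels):
    """Prototype per label = mean of that label's embeddings."""
    groups = {}
    for emb, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(emb)
    return {lab: [sum(col) / len(embs) for col in zip(*embs)]
            for lab, embs in groups.items()}

def proto_reward(embedding, prototypes, proto_values):
    """Soft-assign an embedding to prototypes (closer = higher weight)
    and return the weighted sum of hypothetical per-prototype values."""
    labs = sorted(prototypes)
    weights = softmax([-sq_dist(embedding, prototypes[l]) for l in labs])
    return sum(w * proto_values[l] for w, l in zip(weights, labs))

def pairwise_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected)."""
    return math.log1p(math.exp(-(r_chosen - r_rejected)))

# Toy support set: four labelled embeddings (assumed, not from the paper).
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["good", "good", "bad", "bad"]
protos = class_prototypes(support, labels)
values = {"good": 1.0, "bad": -1.0}  # hypothetical prototype values

r_chosen = proto_reward([0.9, 0.1], protos, values)
r_rejected = proto_reward([0.1, 0.9], protos, values)
loss = pairwise_loss(r_chosen, r_rejected)
```

In this toy setup the preferred response lies at the "good" prototype, so it receives the higher reward and the loss falls below log 2, the value obtained when both rewards are equal. A real reward model would learn the embeddings and prototype values jointly rather than fixing them by hand.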

Topics

Artificial Intelligence > Core AI > Foundation Models
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Self-Supervised Learning
Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Learning Types > Representation Learning
Artificial Intelligence > Core AI > Reinforcement Learning

Keywords

representation learning, few-shot learning, reinforcement learning from human feedback, reward model, human preference, prototypical network, large language model
