HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

Zhilin Wang; Yi Dong; Jiaqi Zeng; Virginia Adams; Makesh Narsimhan Sreedhar; Daniel Egert; Olivier Delalleau; Jane Scowcroft; Neel Kant; Aidan Swope; Oleksii Kuchaiev

NAACL 2024

Abstract

Existing open-source helpfulness preference datasets do not specify what makes some responses more helpful and others less so. Models trained on these datasets can incidentally learn to model dataset artifacts (e.g. preferring longer but unhelpful responses only because of their length). To alleviate this problem, we collect HelpSteer, a multi-attribute helpfulness dataset annotated for the various aspects that make responses helpful. Specifically, our 37k-sample dataset has annotations for correctness, coherence, complexity, and verbosity in addition to the overall helpfulness of responses. Training Llama 2 70B on the HelpSteer dataset with the SteerLM technique produces a model that scores 7.54 on MT Bench, which is currently the highest score for open models that do not require training data from more powerful models (e.g. GPT-4). We release this dataset under a CC-BY-4.0 license at https://huggingface.co/datasets/nvidia/HelpSteer
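To make the annotation scheme concrete, the following is a minimal sketch of how one multi-attribute record could be represented and queried. The five attribute names come from the abstract; the `prompt` and `response` field names, the integer score scale, the thresholds, and the toy records are assumptions made for illustration and are not taken from the released dataset.

```python
from dataclasses import dataclass

# Attribute names as listed in the abstract.
ATTRIBUTES = ("helpfulness", "correctness", "coherence", "complexity", "verbosity")


@dataclass
class Sample:
    """One annotated prompt/response pair (field layout is an assumption)."""
    prompt: str
    response: str
    helpfulness: int
    correctness: int
    coherence: int
    complexity: int
    verbosity: int


def attribute_means(samples):
    """Return the mean score of each attribute over a non-empty list of samples."""
    return {
        name: sum(getattr(s, name) for s in samples) / len(samples)
        for name in ATTRIBUTES
    }


def concise_and_helpful(samples, min_helpfulness=3, max_verbosity=2):
    """Keep responses rated helpful without being long (hypothetical thresholds)."""
    return [
        s for s in samples
        if s.helpfulness >= min_helpfulness and s.verbosity <= max_verbosity
    ]


# Invented toy records, for illustration only; not real HelpSteer data.
TOY_SAMPLES = [
    Sample("What is 2 + 2?", "4.", 4, 4, 4, 2, 1),
    Sample("What is 2 + 2?", "Numbers are old. (long digression)", 1, 1, 3, 2, 4),
    Sample("What is 2 + 2?", "It is 4, as a long derivation shows.", 3, 3, 4, 1, 3),
]

if __name__ == "__main__":
    print(attribute_means(TOY_SAMPLES))
    print([s.response for s in concise_and_helpful(TOY_SAMPLES)])
```

Keeping verbosity as its own score, separate from helpfulness, is what lets a filter like `concise_and_helpful` tell a short helpful answer apart from a long unhelpful one, which is the length artifact the abstract describes.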

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Weakly Supervised Learning
Machine Learning > Application Areas > Data Augmentation

Keywords

reward modeling; preference learning; language model; supervised fine-tuning; multi-attribute annotation
