Style Vectors for Steering Generative Large Language Models

Kai Konen; Sophie Jentzsch; Diaoulé Diallo; Peer Schütt; Oliver Bensch; Roxanne El Baff; Dominik Opitz; Tobias Hecking

EACL 2024

Abstract

This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that style vectors can be simply computed from recorded layer activations for input texts in a specific style, in contrast to more complex training-based approaches. Through a series of experiments, we demonstrate the effectiveness of activation engineering using such style vectors to influence the style of generated text in a nuanced and parameterisable way, distinguishing it from prompt engineering. The presented research constitutes a significant step towards developing more adaptive and effective AI-empowered interactive systems.
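The idea of computing a style vector from recorded activations and adding it to a hidden layer can be illustrated with a toy sketch. The abstract does not give the exact formula, so the sketch assumes one plausible reading: the style vector is the mean activation of texts in the target style minus the mean activation of the remaining texts, and steering adds that vector scaled by a strength parameter. The function names (`mean_vector`, `style_vector`, `steer`), the `strength` parameter, and the two-dimensional toy activations are all hypothetical and stand in for real recorded layer activations.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    if any(len(row) != dim for row in rows):
        raise ValueError("all activation vectors must have the same length")
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def style_vector(target_acts: Sequence[Sequence[float]],
                 other_acts: Sequence[Sequence[float]]) -> Vector:
    """Assumed definition: mean(target style) minus mean(other texts)."""
    target_mean = mean_vector(target_acts)
    other_mean = mean_vector(other_acts)
    return [t - o for t, o in zip(target_mean, other_mean)]


def steer(activation: Sequence[float],
          style_vec: Sequence[float],
          strength: float) -> Vector:
    """Add the scaled style vector to one hidden-layer activation."""
    return [a + strength * s for a, s in zip(activation, style_vec)]


# Toy stand-ins for activations recorded at one hidden layer.
positive_acts = [[1.0, 2.0], [3.0, 4.0]]   # texts in the target style
other_acts = [[0.0, 1.0], [2.0, 1.0]]      # texts in other styles

v = style_vector(positive_acts, other_acts)
steered = steer([0.5, 0.5], v, strength=2.0)
unchanged = steer([0.5, 0.5], v, strength=0.0)
```

The scalar `strength` is what makes the intervention parameterisable in this sketch: setting it to zero leaves the activation untouched, and larger values push the activation further along the style direction. In a real model the addition would be applied to a chosen layer's output during generation rather than to a hand-written list.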

Topics

Artificial Intelligence > Core AI > Model Compression
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

sentiment analysis, style transfer, text generation, parameter efficient, activation engineering
