EMNLP 2025
Sparse Activation Editing for Reliable Instruction Following in Narratives
Abstract
Complex narrative contexts often challenge language models’ ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly evaluate our method, we introduce FreeInstruct, a diverse and realistic benchmark that highlights the challenges of instruction following in narrative-rich settings. While initially motivated by complex narratives, Concise-SAE demonstrates state-of-the-art instruction adherence across varied tasks without compromising generation quality.
Authors
Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Ly, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He
Topics
Artificial Intelligence > Core AI > Interpretability
Deep Learning > Architectures > Transformers
Natural Language Processing > Resources & Methods > Large Language Models
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > In-Context Learning
Deep Learning > Learning Types > Representation Learning
Machine Learning > Learning Types > Interpretability