Does Large Language Model Contain Task-Specific Neurons?

Ran Song; Shizhu He; Shuting Jiang; Yantuan Xian; Shengxiang Gao; Kang Liu; Zhengtao Yu

EMNLP 2024

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in handling a wide range of natural language processing (NLP) tasks. However, different tasks require significantly different knowledge and abilities. It is therefore important to understand whether the same LLM processes different tasks in the same way. Are there specific neurons in an LLM for different tasks? Inspired by neuroscience, this paper pioneers the exploration of whether distinct neurons are activated when an LLM handles different tasks. Compared with the language and knowledge neurons explored in current research, task-specific neurons are harder to identify because of their abstractness, diversity, and complexity. To address these challenges, this paper proposes a method for task-specific neuron localization based on Causal Gradient Variation with Special Tokens (CGVST). CGVST identifies task-specific neurons by concentrating on the most significant tokens during task processing, thereby eliminating redundant tokens and minimizing interference from non-essential neurons. Compared to traditional neuron localization methods, our approach identifies task-specific neurons more effectively. We conduct experiments across eight different public tasks. Experiments that inhibit and amplify the identified neurons demonstrate that our method can accurately locate task-specific neurons.
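To make the idea of locating neurons from selected tokens and then inhibiting or amplifying them more concrete, here is a minimal sketch on a toy one-hidden-layer network. It is not the CGVST procedure, whose details the abstract does not give. As a stand-in it uses a generic gradient-times-activation score, and the network, the `positions` list of "special" tokens, and the function names `neuron_scores` and `forward` are all hypothetical.

```python
import numpy as np

def neuron_scores(X, W1, w2, positions):
    """Score hidden neurons by |activation * gradient|, summed over chosen tokens."""
    H = np.maximum(X @ W1, 0.0)      # (tokens, hidden) ReLU activations
    # Per-token output is y = H @ w2, so the gradient of y w.r.t. neuron j is w2[j].
    attribution = np.abs(H * w2)     # (tokens, hidden)
    # Only the selected token positions contribute (assumed "special" tokens).
    return attribution[positions].sum(axis=0)

def forward(X, W1, w2, scale=None):
    """Toy forward pass; `scale` multiplies each hidden neuron's activation."""
    H = np.maximum(X @ W1, 0.0)
    if scale is not None:
        H = H * scale
    return H @ w2

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))          # 6 tokens, 4-dimensional embeddings
W1 = rng.normal(size=(4, 8))         # 8 hidden neurons
w2 = rng.normal(size=8)
positions = [0, 5]                   # hypothetical special-token positions

scores = neuron_scores(X, W1, w2, positions)
top = np.argsort(scores)[-2:]        # indices of the two highest-scoring neurons

inhibit = np.ones(8)
inhibit[top] = 0.0                   # inhibition: zero out the selected neurons
amplify = np.ones(8)
amplify[top] = 2.0                   # amplification: double their activations

base = forward(X, W1, w2)
inhibited = forward(X, W1, w2, inhibit)
amplified = forward(X, W1, w2, amplify)
print("selected neurons:", top)
print("output shift when inhibited:", inhibited - base)
print("output shift when amplified:", amplified - base)
```

Because the toy output is linear in the hidden activations, zeroing and doubling the selected neurons shift the output by equal and opposite amounts. In a real LLM the effect would instead be measured through task performance, as the experiments described above do.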

Topics

Artificial Intelligence > Core AI > Foundation Models; Artificial Intelligence > Core AI > Interpretability; Deep Learning > Architectures > Transformers; Deep Learning > Architectures > Neural Networks; Machine Learning > Core Methods > Feature Learning; Artificial Intelligence > Core AI > Large Language Models; Deep Learning > Optimization & Theory > Theory; Machine Learning > Core Methods > Interpretability

Keywords

feature learning; neural network analysis; neural network activation; causal analysis; large language model; task-specific neuron; neuron localization; LLM interpretability; causal gradient variation; mechanism interpretability
