Hopscotch: Discovering and Skipping Redundancies in Language Models

Mustafa Eyceoz; Nikhil Shivakumar Nayak; Hao Wang; Ligong Han; Akash Srivastava

EMNLP 2025

Abstract

Modern causal language models stack many attention blocks to improve performance, but not all blocks are necessary for every task. We propose Hopscotch, a simple yet effective method that identifies and skips the attention blocks that contribute least to a task, then adapts the model to preserve output quality. Hopscotch jointly optimizes which blocks to skip and how to scale the outputs of the remaining layers. By adding lightweight, trainable scaling parameters to the attention and MLP blocks, it mitigates the distribution shift in hidden states caused by removing attention blocks. Hopscotch does not modify model weights or require access to pretraining or instruction-tuning data, and it is compatible with existing model compression techniques. When applied to Llama-3.1-8B and Qwen-2.5-7B, Hopscotch incurs less than a 2% drop in performance even after skipping four attention blocks.
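The idea in the abstract can be illustrated with a toy residual stack. The sketch below is not the authors' implementation: the class `ToyLayer`, the helper `least_important_attention`, and the random tanh sub-blocks are hypothetical stand-ins, and the greedy "skip one block and measure the output change" score replaces the joint optimization the paper describes. It only shows the two ingredients the abstract names: a per-block skip decision for attention, and scalar scales on the attention and MLP outputs, with the underlying weights left frozen.

```python
import math
import random


def make_block(dim, rng):
    """Toy frozen sub-block: a fixed random linear map followed by tanh."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

    def f(x):
        return [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in w]

    return f


class ToyLayer:
    """One residual layer: attention stand-in, then MLP stand-in."""

    def __init__(self, dim, rng):
        self.attn = make_block(dim, rng)  # frozen, never updated
        self.mlp = make_block(dim, rng)   # frozen, never updated
        self.attn_scale = 1.0             # scale on the attention output
        self.mlp_scale = 1.0              # scale on the MLP output
        self.skip_attn = False            # True = attention block is skipped

    def forward(self, x):
        if not self.skip_attn:
            a = self.attn(x)
            x = [xi + self.attn_scale * ai for xi, ai in zip(x, a)]
        m = self.mlp(x)
        return [xi + self.mlp_scale * mi for xi, mi in zip(x, m)]


def run(layers, x):
    """Pass one hidden-state vector through every layer in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def least_important_attention(layers, inputs):
    """Assumed greedy proxy: score each attention block by how much the
    final output moves when that block alone is skipped."""
    reference = [run(layers, x) for x in inputs]
    scores = []
    for layer in layers:
        layer.skip_attn = True
        outputs = [run(layers, x) for x in inputs]
        layer.skip_attn = False  # restore before scoring the next block
        scores.append(sum(distance(o, r) for o, r in zip(outputs, reference)))
    best = min(range(len(layers)), key=scores.__getitem__)
    return best, scores
```

In this toy setting, one would call `least_important_attention(layers, inputs)` on a small set of task inputs, set `skip_attn = True` on the returned layer, and then tune `attn_scale` and `mlp_scale` on the remaining blocks to pull the output back toward the reference. How Hopscotch actually performs the selection and the scale training is specified in the paper, not here.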

Topics

Artificial Intelligence > Core AI > Model Compression
Deep Learning > Techniques > Model Architecture

Keywords

model compression, hidden state, language model, layer skipping, attention block
