ReLUPruner: Rethinking ReLU Importance with Taylor Expansion for Efficient Private Inference

Zhenpeng Li; Jinshuo Liu; Xinyan Wang; Lina Wang; Jeff Z. Pan

AAAI 2026

Abstract

With the growing adoption of Machine-Learning-as-a-Service (MLaaS), Private Inference (PI) has emerged as a promising way to address its security concerns through cryptographic techniques. However, nonlinear operations in neural networks account for most of the computational and communication overhead in PI. Existing studies mainly focus on optimizing and reducing the number of ReLU activations in neural networks, but traditional pruning methods may mistakenly remove ReLUs that are critical to maintaining model accuracy. To evaluate the importance of ReLUs in the network accurately, we propose ReLUPruner, a method that uses Taylor expansion to quantify the change in loss caused by replacing a ReLU. Furthermore, we establish a hierarchical importance metric to guide layer-wise ReLU budget allocation and adopt a progressive pruning strategy that dynamically adjusts the pruning rate of each layer according to training progress. Extensive experiments on various models and datasets show that ReLUPruner achieves a good balance between ReLU budget and model accuracy, yielding improvements of 1.89% (12.9k ReLUs, CIFAR-10), 3.62% (50k ReLUs, CIFAR-100) and 2.66% (30k ReLUs, Tiny-ImageNet) over the previous state-of-the-art.
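The abstract does not give the exact scoring formula, so the following is only a minimal sketch of a generic first-order Taylor importance score. It makes two assumptions that the abstract does not confirm: a pruned ReLU is replaced by the identity function, and a layer's importance is the sum of its per-unit scores. The function names `taylor_relu_importance` and `allocate_budget` are hypothetical and are not taken from the paper.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def taylor_relu_importance(pre_acts, grads):
    """First-order estimate of |delta L| when one ReLU is replaced by identity.

    pre_acts: pre-activation values x of a single ReLU unit over a batch.
    grads:    dL/dy at the ReLU output y = relu(x) for the same samples.
    Assumption: the replacement function is the identity, y = x.
    """
    # Swapping relu(x) for x shifts the output by (x - relu(x)), so a
    # first-order Taylor expansion gives delta L ~ grad * (x - relu(x)).
    delta = sum(g * (x - relu(x)) for x, g in zip(pre_acts, grads))
    return abs(delta) / len(pre_acts)


def allocate_budget(layer_scores, total_budget):
    """Split a ReLU budget across layers in proportion to layer importance.

    Assumption: proportional allocation; the paper's hierarchical metric
    may differ.
    """
    total = sum(layer_scores)
    return [int(total_budget * s / total) for s in layer_scores]


if __name__ == "__main__":
    score = taylor_relu_importance([1.0, -2.0, 3.0, -0.5], [0.1, 0.5, -0.2, -1.0])
    print(score)
    print(allocate_budget([0.125, 0.375], 1000))
```

Under these assumptions, a unit whose pre-activations are always positive scores zero, because its ReLU already behaves like the identity and can be replaced without changing the loss estimate.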

Authors

Zhenpeng Li, Jinshuo Liu, Xinyan Wang, Lina Wang, Jeff Z. Pan

Topics

Machine Learning > Application Areas > Efficient Computing
Machine Learning > Application Areas > Privacy
Deep Learning > Techniques > Model Architecture

Keywords

model compression, neural network pruning, ReLU activation, Taylor expansion, private inference
