MEET: Towards Memory-Efficient Temporal Sparse Deep Neural Networks

Zeqi Zhu; Ibrahim Batuhan Akkaya; Luc Waeijen; Egor Bondarev; Arash Pourtaherian; Orlando Moreira

CVPR 2025

Abstract

Deep Neural Networks (DNNs) are accurate but compute-intensive, leading to substantial energy consumption during inference. Exploiting temporal redundancy through Δ-Σ convolution in video processing has proven to greatly enhance computation efficiency. However, temporal Δ-Σ DNNs typically require substantial memory for storing neuron states to compute inter-frame differences, hindering their on-chip deployment. Directly compressing these states to reduce the memory cost can disrupt the linearity of temporal Δ-Σ convolution, causing errors to accumulate in long-term Δ-Σ processing. We therefore propose MEET, an optimization framework for MEmory-Efficient Temporal Δ-Σ DNNs. MEET transfers the state compression challenge to a well-established weight compression problem by trading fewer activations for more weights, and introduces a co-design of network architecture and suppression method to optimize for mixed spatial-temporal execution. Evaluations on three vision applications demonstrate a 5.1× to 13.3× reduction in total memory compared to the most computation-efficient temporal DNNs, while preserving computation efficiency and model accuracy in long-term Δ-Σ processing. MEET facilitates the deployment of temporal Δ-Σ DNNs within the on-chip memory of embedded event-driven platforms, empowering low-power edge processing.
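
The linearity argument in the abstract can be illustrated with a toy sketch. The code below is not the MEET method or the authors' implementation; it is a minimal 1-D example under stated assumptions. The names `conv1d`, `quantize`, `delta_sigma_run` and `max_errors` are hypothetical, the "video" is a static sequence of identical frames, and rounding to a grid stands in for an arbitrary lossy state compressor. It shows that accumulating convolved inter-frame differences reproduces the direct convolution exactly, whereas compressing the stored input state makes the error grow from frame to frame.

```python
def conv1d(x, w):
    """Valid 1-D correlation of signal x with kernel w (a linear operator)."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def quantize(x, step):
    """Hypothetical lossy state compression: round each value to a grid."""
    return [round(v / step) * step for v in x]


def delta_sigma_run(frames, w, compress=None):
    """Process frames via inter-frame differences (delta) and accumulation (sigma).

    state_in  stores the previous input frame, needed to form the delta.
    state_out stores the accumulated output.
    If `compress` is given, it is applied to the stored input state only.
    """
    state_in = [0.0] * len(frames[0])
    state_out = [0.0] * (len(frames[0]) - len(w) + 1)
    outputs = []
    for x in frames:
        delta = [a - b for a, b in zip(x, state_in)]           # delta step
        d_out = conv1d(delta, w)                               # convolve the difference
        state_out = [a + b for a, b in zip(state_out, d_out)]  # sigma step
        outputs.append(list(state_out))
        state_in = compress(x) if compress else list(x)
    return outputs


def max_errors(frames, w, compress=None):
    """Per-frame maximum absolute error against the direct convolution."""
    outs = delta_sigma_run(frames, w, compress)
    return [max(abs(a - b) for a, b in zip(o, conv1d(x, w)))
            for o, x in zip(outs, frames)]


frames = [[0.3] * 8 for _ in range(10)]  # a static toy "video" (assumption)
w = [0.5, 0.25, 0.25]                    # arbitrary kernel (assumption)

exact = max_errors(frames, w)
lossy = max_errors(frames, w, compress=lambda x: quantize(x, 1.0))

print("uncompressed state:", exact)
print("compressed state:  ", lossy)
```

In this toy setting the uncompressed run sees all-zero deltas after the first frame, which is the temporal sparsity that Δ-Σ processing exploits. With the compressed state, the stored frame no longer matches the true previous frame, so every delta is non-zero and the same residual is accumulated again at each step.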

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Efficient Computing
Machine Learning > Application Areas > Model Compression
Deep Learning > Architectures > Neural Networks
Deep Learning > Optimization & Theory > Optimization
Deep Learning > Optimization & Theory > Model Compression
Deep Learning > Optimization & Theory > Efficient Computing
Artificial Intelligence > Core AI > Efficient Computing
Computer Vision > Core AI > Efficient Computing

Keywords

model compression, memory efficiency, memory optimization, edge computing, edge deployment, computation efficiency, temporal processing, model accuracy, neural network, temporal redundancy, delta-sigma convolution, temporal neural network, sparse deep neural network
