Real-Time Video Inference on Edge Devices via Adaptive Model Streaming

Mehrdad Khani; Pouya Hamadanian; Arash Nasr-Esfahany; Mohammad Alizadeh

ICCV 2021

Abstract

Real-time video inference on edge devices like mobile phones and drones is challenging due to the high computation cost of Deep Neural Networks. We present Adaptive Model Streaming (AMS), a new approach to improving the performance of efficient lightweight models for video inference on edge devices. AMS uses a remote server to continually train and adapt a small model running on the edge device, boosting its performance on the live video using online knowledge distillation from a large, state-of-the-art model. We discuss the challenges of over-the-network model adaptation for video inference and present several techniques to reduce the communication cost of this approach: avoiding excessive overfitting, updating a small fraction of important model parameters, and adaptive sampling of training frames at edge devices. On the task of video semantic segmentation, our experimental results show 0.4 to 17.8 percent mean Intersection-over-Union improvement compared to a pre-trained model across several video datasets. Our prototype can perform video segmentation at 30 frames per second with 40 milliseconds camera-to-label latency on a Samsung Galaxy S10+ mobile phone, using less than 300 Kbps uplink and downlink bandwidth on the device.
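One of the techniques named in the abstract, updating only a small fraction of important model parameters, can be illustrated with a minimal sketch. The abstract does not say how importance is measured, so the criterion used here (largest absolute change between the old and the newly trained weights) is an assumption, as are the function names `select_sparse_update` and `apply_sparse_update` and the flat-list representation of the model.

```python
from typing import Dict, List


def select_sparse_update(old: List[float], new: List[float],
                         fraction: float) -> Dict[int, float]:
    """Server side: keep only the parameters that changed the most.

    ASSUMPTION: "important" means largest absolute change. The abstract
    only states that a small fraction of parameters is updated.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if len(old) != len(new):
        raise ValueError("old and new must have the same length")
    # Number of parameters to send, at least one.
    k = max(1, int(len(old) * fraction))
    # Rank parameter indices by how much each value moved during training.
    ranked = sorted(range(len(old)),
                    key=lambda i: abs(new[i] - old[i]),
                    reverse=True)
    # The sparse update maps parameter index -> new value.
    return {i: new[i] for i in ranked[:k]}


def apply_sparse_update(params: List[float],
                        update: Dict[int, float]) -> List[float]:
    """Edge side: patch the local model copy with the received values."""
    patched = list(params)
    for index, value in update.items():
        patched[index] = value
    return patched


if __name__ == "__main__":
    edge_weights = [0.0, 1.0, 2.0, 3.0]
    server_weights = [0.1, 1.0, 2.5, 2.0]
    update = select_sparse_update(edge_weights, server_weights, fraction=0.5)
    print(apply_sparse_update(edge_weights, update))
```

Sending index-value pairs for a subset of the weights, rather than the full model, is what keeps the downlink cost of each update small in this sketch; a real system would also need to encode the indices compactly.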

Topics

Machine Learning > Application Areas > Efficient Computing
Deep Learning > Techniques > Model Architecture
Computer Vision > Processing > Video Processing
Computer Vision > Processing > Semantic Segmentation
Deep Learning > Techniques > Knowledge Distillation

Keywords

online learning; video segmentation; knowledge distillation; model adaptation; edge computing; real-time inference; video semantic segmentation; model streaming
