Defense Against Model Stealing Based on Account-Aware Distribution Discrepancy

Jian-Ping Mei; Weibin Zhang; Jie Chen; Xuyun Zhang; Tiantian Zhu

AAAI 2025

Abstract

Malicious users attempt to functionally replicate commercial models at low cost by training a clone model on query responses. Preventing such model-stealing attacks in time is challenging, because a defense must provide strong protection while maintaining utility. In this paper, we propose a novel non-parametric detector called Account-aware Distribution Discrepancy (ADD) that recognizes queries from malicious users by leveraging account-wise local dependency. We model each class as a multivariate normal (MVN) distribution in the feature space and compute the malicious score as the sum of weighted class-wise distribution discrepancies. The ADD detector is combined with random-based prediction poisoning to yield a plug-and-play defense module, named D-ADD, for image classification models. Extensive experiments show that D-ADD achieves strong defense against different types of attacks with little interference in serving benign users, in both soft-label and hard-label settings.
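To make the scoring idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the class-wise discrepancy is the KL divergence between an MVN fitted to one account's queries and a reference MVN for the same class, and that each class is weighted by the fraction of the account's queries assigned to it. The abstract specifies neither choice. The names `fit_class_gaussians`, `gaussian_kl`, and `account_score`, and the parameters `eps` and `min_count`, are hypothetical.

```python
import numpy as np


def fit_class_gaussians(features, labels, num_classes, eps=1e-3):
    """Fit one MVN (mean, covariance) per class on reference features.

    Assumes every class has at least two reference samples.
    """
    dim = features.shape[1]
    stats = []
    for c in range(num_classes):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # eps * I keeps the covariance invertible (an assumed regularizer).
        cov = np.cov(x, rowvar=False) + eps * np.eye(dim)
        stats.append((mean, cov))
    return stats


def gaussian_kl(mean_q, cov_q, mean_p, cov_p):
    """Closed-form KL(N_q || N_p) between two multivariate normals."""
    dim = mean_q.shape[0]
    cov_p_inv = np.linalg.inv(cov_p)
    diff = mean_p - mean_q
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (
        np.trace(cov_p_inv @ cov_q)
        + diff @ cov_p_inv @ diff
        - dim
        + logdet_p
        - logdet_q
    )


def account_score(query_features, query_labels, reference_stats,
                  eps=1e-3, min_count=2):
    """Sum of weighted class-wise discrepancies for one account's queries.

    Assumption: the weight of a class is the share of the account's
    queries predicted as that class; classes with too few queries to
    estimate a covariance are skipped.
    """
    dim = query_features.shape[1]
    total = len(query_labels)
    score = 0.0
    for c, (mean_p, cov_p) in enumerate(reference_stats):
        x = query_features[query_labels == c]
        if len(x) < min_count:
            continue
        weight = len(x) / total
        mean_q = x.mean(axis=0)
        cov_q = np.cov(x, rowvar=False) + eps * np.eye(dim)
        score += weight * gaussian_kl(mean_q, cov_q, mean_p, cov_p)
    return score
```

Under these assumptions, an account whose queries follow the reference class distributions receives a score near zero, while an account querying with out-of-distribution samples receives a larger score. A deployment would compare the score against a threshold before applying prediction poisoning to that account's responses; that threshold is not given in the abstract.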

Topics

Machine Learning > Learning Types > Adversarial Learning
Machine Learning > Application Areas > Privacy
Security & Privacy > Privacy
Deep Learning > Learning Types > Adversarial Learning
Machine Learning > Learning Types > Robustness

Keywords

adversarial defense; model stealing; distribution discrepancy; multivariate normal distribution; clone detection; prediction poisoning
