A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality

Hanbo Huang; Yihan Li; Bowen Jiang; Bo Jiang; Lin Liu; Zhuotao Liu; Ruoyu Sun; SHIYU LIANG

2025 EMNLP EMNLP 2025

A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality

Abstract

AbstractPrivacy-sensitive users require deploying large language models (LLMs) within their own infrastructure (on-premises) to safeguard private data and enable customization. However, vulnerabilities in local environments can lead to unauthorized access and potential model theft. To address this, prior research on small models has explored securing only the output layer within hardware-secured devices to balance model confidentiality and customization. Yet this approach fails to protect LLMs effectively. In this paper, we discover that (1) query-based distillation attacks targeting the secured top layer can produce a functionally equivalent replica of the victim model; (2) securing the same number of layers, bottom layers before a transition layer provide stronger protection against distillation attacks than top layers, with comparable effects on customization performance; and (3) the number of secured layers creates a trade-off between protection and customization flexibility. Based on these insights, we propose SOLID, a novel deployment framework that secures a few bottom layers in a secure environment and introduces an efficient metric to optimize the trade-off by determining the ideal number of hidden layers. Extensive experiments on five models (1.3B to 70B parameters) demonstrate that SOLID outperforms baselines, achieving a better balance between protection and downstream customization.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Security & Privacy

🧭 Keyword Pioneer — on-premises deployment

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Hanbo Huang , Yihan Li , Bowen Jiang , Bo Jiang , Lin Liu , Zhuotao Liu , Ruoyu Sun , SHIYU LIANG

Topics

Machine Learning > Application Areas > Knowledge Distillation Machine Learning > Application Areas > Privacy Machine Learning > Application Areas > Model Compression Security & Privacy > Privacy Artificial Intelligence > Core AI > Privacy Machine Learning > Learning Types > Knowledge Distillation Deep Learning > Optimization & Theory > Model Compression

Keywords

knowledge distillation privacy preservation large language model on-premises deployment model confidentiality distillation attack layer security

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025