TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference

Xin Zhou; Yi Lu; Ruotian Ma; Tao Gui; Qi Zhang; Xuanjing Huang

EMNLP 2023


Abstract

Pre-trained language models (PLMs) are often deployed as cloud services, enabling users to upload textual data and perform inference remotely. However, users’ personal text often contains sensitive information, and sharing such data directly with service providers can lead to serious privacy leakage. To address this problem, we introduce a novel privacy-preserving inference framework called MixPi, which prevents plaintext leakage during the inference phase. Inspired by k-anonymity, MixPi aims to obfuscate a user’s private input by mixing it with multiple other inputs, thereby confounding potential privacy attackers. To achieve this, our approach involves: (1) proposing a novel encryption module, Privacy Mixer, which encrypts input along three distinct dimensions: mixing, representation, and position; (2) adopting a pre-trained Multi-input Multi-output network to handle mixed representations and obtain multiple predictions; and (3) employing a Privacy Demixer to ensure that only the user can decrypt the real output among the multiple predictions. Furthermore, we explore different ways to automatically generate the synthetic inputs required for mixing. Experimental results on token and sentence classification tasks demonstrate that MixPi greatly surpasses existing privacy-preserving methods in both performance and privacy.
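The k-anonymity intuition in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's Privacy Mixer or Privacy Demixer: those operate on encrypted, mixed representations, whereas this toy only hides which slot holds the real input and leaves every text readable by the server. The names `mix_inputs`, `toy_server`, and `demix` are hypothetical, and the "model" is a placeholder that returns one label per slot.

```python
import secrets


def mix_inputs(real_text, synthetic_texts):
    """Insert the real input at a secret slot among the synthetic inputs."""
    k = len(synthetic_texts) + 1
    secret_slot = secrets.randbelow(k)  # known only to the user
    mixed = list(synthetic_texts)
    mixed.insert(secret_slot, real_text)
    return mixed, secret_slot


def toy_server(mixed_inputs):
    """Placeholder for a remote multi-input multi-output model.

    Assumption: it returns exactly one prediction per input slot.
    The label here is just the parity of the word count.
    """
    return [len(text.split()) % 2 for text in mixed_inputs]


def demix(predictions, secret_slot):
    """Only the holder of the secret slot can pick out the real prediction."""
    return predictions[secret_slot]


real = "my card number is 1234"
decoys = [
    "the weather is nice today",
    "stocks fell on monday",
    "she plays the violin",
]

mixed, slot = mix_inputs(real, decoys)
predictions = toy_server(mixed)   # server sees k inputs, returns k outputs
result = demix(predictions, slot)  # user keeps only the real one
```

In this toy the server still receives one output per slot and cannot tell which slot the user cares about. The three encryption dimensions named in the abstract (mixing, representation, position) are not modeled here.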



Topics

Machine Learning > Application Areas > Privacy; Natural Language Processing > Resources & Methods > Large Language Models; Security & Privacy > Privacy; Deep Learning > Learning Types > Transfer Learning

Keywords

natural language processing; text classification; pre-trained language model; sentence classification; privacy-preserving inference; large language model; input mixing; input obfuscation

