Adaptive Differential Privacy for Language Model Training

Xinwei Wu; Li Gong; Deyi Xiong

ACL 2022

Abstract

Although differential privacy (DP) can protect language models from leaking private information, its indiscriminate protection of all data points reduces its practical utility. Previous works improve DP training by distinguishing private from non-private data, but they rely on datasets annotated with prior privacy information, which is not available in real-world scenarios. In this paper, we propose an Adaptive Differential Privacy (ADP) framework for language modeling that does not resort to prior privacy information. We use a language model to estimate the probability that a linguistic item contains private information. We further propose a new Adam algorithm that adjusts the amount of differential privacy noise injected into the language model according to the estimated privacy probabilities. Experiments demonstrate that ADP improves differentially private language modeling and provides good protection against canary attackers.
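
The idea of scaling DP noise by an estimated privacy probability before an Adam update can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption for illustration only: the estimator `privacy_probability` (one minus the geometric-mean token probability), the linear noise scaling in `adaptive_noisy_gradient`, and all function and parameter names are hypothetical, and the sketch makes no claim about the formal privacy guarantee that results.

```python
import math
import random


def privacy_probability(token_log_probs):
    """Hypothetical estimator (an assumption, not the paper's formula).

    Maps the mean token log-probability under a language model to a score
    in [0, 1): text the model finds unlikely gets a higher score.
    """
    mean_lp = sum(token_log_probs) / len(token_log_probs)
    return 1.0 - math.exp(mean_lp)  # exp(mean_lp) = geometric-mean token probability


def clip(grad, max_norm):
    """Rescale the gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def adaptive_noisy_gradient(grad, privacy_prob, max_norm=1.0, sigma=1.0, rng=random):
    """Clip the gradient, then add Gaussian noise whose standard deviation
    is scaled linearly by the privacy probability (assumed scaling rule)."""
    clipped = clip(grad, max_norm)
    std = sigma * max_norm * privacy_prob
    return [g + rng.gauss(0.0, std) for g in clipped]


def adam_step(params, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update applied to an already-noised gradient."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grad)):
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * g * g
        m_hat = state["m"][i] / (1 - b1 ** t)  # bias-corrected first moment
        v_hat = state["v"][i] / (1 - b2 ** t)  # bias-corrected second moment
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


if __name__ == "__main__":
    params = [0.5, -0.3]
    state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
    # Made-up per-token log-probabilities for one training example.
    prob = privacy_probability([-2.3, -4.1, -0.7])
    noisy = adaptive_noisy_gradient([3.0, 4.0], prob, rng=random.Random(0))
    params = adam_step(params, noisy, state)
    print(prob, params)
```

In this sketch an example with a privacy probability near zero receives almost no noise, while a low-likelihood example receives close to the full noise `sigma * max_norm`.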
