Preserving Privacy Through Dememorization: An Unlearning Technique For Mitigating Memorization Risks In Language Models

Aly Kassem; Omar Mahmoud; Sherif Saad

EMNLP 2023

Abstract

Large language models (LLMs) are trained on vast amounts of data, including sensitive information that poses a risk to personal privacy if exposed. LLMs have been shown to memorize portions of their training data and reproduce them when prompted by adversaries. Prior research has addressed this memorization issue and sought to prevent verbatim replication through techniques such as knowledge unlearning and data pre-processing. However, these methods have limitations: they protect only a limited number of samples, cover a limited range of privacy types, and can degrade the quality of the generative model. To tackle this challenge more effectively, we propose "DeMem," a novel unlearning approach that uses an efficient reinforcement learning feedback loop via proximal policy optimization. By fine-tuning the language model with a negative similarity score as the reward signal, we incentivize the LLM to learn a paraphrasing policy that unlearns the pre-training data. Our experiments demonstrate that DeMem surpasses strong baselines and state-of-the-art methods in its ability to generalize and to balance privacy against LLM performance.
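
The reward signal described above can be illustrated with a minimal sketch. The abstract does not say which similarity metric is used, so the token-level `difflib.SequenceMatcher` ratio below is a stand-in assumption, and the function name `negative_similarity_reward` is hypothetical rather than taken from the authors' code. The sketch only shows the sign convention: the closer a generated continuation is to the memorized training suffix, the lower the reward.

```python
from difflib import SequenceMatcher


def negative_similarity_reward(generated_suffix: str, true_suffix: str) -> float:
    """Return the negated similarity between two suffixes, in the range [-1, 0].

    ASSUMPTION: similarity is a whitespace-token SequenceMatcher ratio,
    chosen only for illustration; the paper's metric may differ.
    """
    generated_tokens = generated_suffix.split()
    true_tokens = true_suffix.split()
    # ratio() is 1.0 for identical token sequences and 0.0 when nothing matches.
    similarity = SequenceMatcher(None, generated_tokens, true_tokens).ratio()
    # Negate so that verbatim reproduction is penalized most heavily.
    return -similarity


if __name__ == "__main__":
    memorized = "call me at 555 0100 after six"
    verbatim = "call me at 555 0100 after six"
    paraphrase = "you can reach me by phone in the evening"
    print(negative_similarity_reward(verbatim, memorized))
    print(negative_similarity_reward(paraphrase, memorized))
```

In a proximal policy optimization loop, a scalar of this kind would be supplied as the reward for each sampled continuation, so the policy is pushed away from verbatim reproduction and toward paraphrase.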

Topics

Artificial Intelligence > Core AI > AI Safety
Machine Learning > Application Areas > Privacy
Reinforcement Learning > Methods > Policy Learning
Artificial Intelligence > Core AI > Privacy
Deep Learning > Learning Types > Reinforcement Learning
Machine Learning > Learning Types > Machine Unlearning
Deep Learning > Learning Types > Machine Unlearning

Keywords

reinforcement learning; privacy preservation; machine unlearning; language model; proximal policy optimization; reinforcement learning feedback; memorization mitigation; unlearning technique; language model memorization; paraphrasing policy; memorization risk
