Sparsity-Preserving Differentially Private Training of Large Embedding Models

Badih Ghazi; Yangsibo Huang; Pritish Kamath; Ravi Kumar; Pasin Manurangsi; Amer Sinha; Chiyuan Zhang

NeurIPS 2023

Abstract

As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without compromising model accuracy by much. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during the private training of large embedding models. Our algorithms achieve substantial reductions ($10^6 \times$) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
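The sparsity problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DP-FEST or DP-AdaFEST algorithm; it only shows a naive DP-SGD aggregation step for an embedding table under assumed settings. All names here (`clip`, `naive_dp_sgd_embedding_gradient`, `count_nonzero_rows`) and all parameter values (vocabulary size, embedding dimension, clipping norm, noise multiplier) are hypothetical and chosen for illustration. Each example touches only a few embedding rows, so the summed gradient is sparse, but adding Gaussian noise to every coordinate makes every row nonzero.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a sparse per-example gradient {row: vector} so its L2 norm is <= max_norm."""
    norm = math.sqrt(sum(x * x for vec in grad.values() for x in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return {row: [scale * x for x in vec] for row, vec in grad.items()}


def naive_dp_sgd_embedding_gradient(per_example_grads, vocab_size, dim,
                                    max_norm, noise_multiplier, rng):
    """Clip and sum sparse per-example gradients, then add noise to EVERY row.

    Returns (number of rows touched before noise, dense noisy gradient).
    """
    summed = {}
    for grad in per_example_grads:
        for row, vec in clip(grad, max_norm).items():
            acc = summed.setdefault(row, [0.0] * dim)
            for j, x in enumerate(vec):
                acc[j] += x

    std = noise_multiplier * max_norm  # standard Gaussian-mechanism scaling
    zero = [0.0] * dim
    noisy = []
    for row in range(vocab_size):
        base = summed.get(row, zero)
        # Noise is added to all rows, including rows no example touched.
        noisy.append([base[j] + rng.gauss(0.0, std) for j in range(dim)])
    return len(summed), noisy


def count_nonzero_rows(dense):
    """Count rows that contain at least one nonzero coordinate."""
    return sum(1 for vec in dense if any(x != 0.0 for x in vec))


rng = random.Random(0)
vocab_size, dim = 1000, 4  # assumed toy sizes, far smaller than real tables

# Eight examples, each touching at most two embedding rows.
per_example_grads = [
    {rng.randrange(vocab_size): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
     for _ in range(2)}
    for _ in range(8)
]

rows_before, noisy_grad = naive_dp_sgd_embedding_gradient(
    per_example_grads, vocab_size, dim,
    max_norm=1.0, noise_multiplier=1.0, rng=rng)
rows_after = count_nonzero_rows(noisy_grad)

print("rows with nonzero gradient before noise:", rows_before)  # at most 16
print("rows with nonzero gradient after noise:", rows_after)
```

In this toy setting the non-private gradient touches at most 16 of the 1000 rows, whereas the noisy gradient is dense, so an update step has to write to the entire table. That gap is the efficiency loss the abstract describes, and it grows with the vocabulary size.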

Topics

Artificial Intelligence > Core AI > Model Compression
Machine Learning > Application Areas > Privacy

Keywords

differential privacy, model training, embedding model, gradient sparsity
