(Nearly) Optimal Private Linear Regression for Sub-Gaussian Data via Adaptive Clipping

Prateek Varshney; Abhradeep Thakurta; Prateek Jain

COLT 2022


Abstract

We study the problem of differentially private linear regression where each data point is sampled from a fixed sub-Gaussian style distribution. We propose and analyze a one-pass mini-batch stochastic gradient descent method (DP-AMBSSGD) where points in each iteration are sampled without replacement. Noise is added for DP, but the noise standard deviation is estimated online. Compared to existing $(\epsilon, \delta)$-DP techniques, which have sub-optimal error bounds, DP-AMBSSGD provides nearly optimal error bounds in terms of key parameters such as the dimensionality $d$, the number of points $N$, and the standard deviation $\sigma$ of the noise in observations. For example, when the $d$-dimensional covariates are sampled i.i.d. from the normal distribution, the excess error of DP-AMBSSGD due to privacy is $\sigma^2 d/N(1+d/(\epsilon^2 N))$, i.e., the error is meaningful when the number of samples $N \geq d \log d$, which is the standard operative regime for linear regression. In contrast, the error bound for existing efficient methods in this setting is $d^3/(\epsilon^2 N^2)$, even for $\sigma=0$. That is, for constant $\epsilon$, the existing techniques require $N=d^{1.5}$ to provide a non-trivial result.
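To make the comparison in the abstract concrete, the following is a minimal sketch that evaluates the two stated error expressions at a few sample sizes. It assumes the first expression is read as $(\sigma^2 d/N)\cdot(1+d/(\epsilon^2 N))$ and drops all constants and logarithmic factors, so the numbers indicate scaling only. The function names and the chosen values of $d$, $\epsilon$, and $\sigma$ are illustrative and do not come from the paper.

```python
import math


def dp_ambssgd_bound(d, n, eps, sigma):
    """Abstract's expression for DP-AMBSSGD, read as
    (sigma^2 * d / N) * (1 + d / (eps^2 * N)).
    Constants and log factors are omitted (assumption)."""
    return (sigma ** 2) * d / n * (1.0 + d / (eps ** 2 * n))


def prior_bound(d, n, eps):
    """Abstract's expression for existing efficient methods:
    d^3 / (eps^2 * N^2), which does not depend on sigma."""
    return d ** 3 / (eps ** 2 * n ** 2)


def compare(d, eps=1.0, sigma=1.0):
    """Evaluate both expressions at N = d log d, d^1.5, and d^2."""
    sample_sizes = [
        ("d log d", d * math.log(d)),
        ("d^1.5", d ** 1.5),
        ("d^2", float(d ** 2)),
    ]
    return [
        (label, n, dp_ambssgd_bound(d, n, eps, sigma), prior_bound(d, n, eps))
        for label, n in sample_sizes
    ]


if __name__ == "__main__":
    # Illustrative values only: d = 100, eps = 1, sigma = 1.
    print(f"{'N':>8} {'value':>10} {'DP-AMBSSGD':>12} {'existing':>12}")
    for label, n, new, old in compare(d=100):
        print(f"{label:>8} {n:>10.0f} {new:>12.4f} {old:>12.4f}")
```

Under these assumptions, with $d = 100$ and $\epsilon = 1$, the existing-methods expression falls to 1 only at $N = d^{1.5} = 1000$, whereas the DP-AMBSSGD expression is already below 1 at $N = d \log d$, matching the regimes described in the abstract.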



Topics

Machine Learning > Core Methods > Regression
Machine Learning > Application Areas > Privacy
Machine Learning > Learning Types > Regression

Keywords

differential privacy, stochastic gradient descent, linear regression, sub-Gaussian distribution, adaptive clipping

