2023 ICML ICML 2023

Tight Data Access Bounds for Private Top-$k$ Selection

Abstract

We study the top-$k$ selection problem under the differential privacy model: $m$ items are rated according to votes of a set of clients. We consider a setting in which algorithms can retrieve data via a sequence of accesses, each either a random access or a sorted access; the goal is to minimize the total number of data accesses. Our algorithm requires only $O(\sqrt{mk})$ expected accesses: to our knowledge, this is the first sublinear data-access upper bound for this problem. Our analysis also shows that the well-known exponential mechanism requires only $O(\sqrt{m})$ expected accesses. Accompanying this, we develop the first lower bounds for the problem, in three settings: only random accesses; only sorted accesses; a sequence of accesses of either kind. We show that, to avoid $\Omega(m)$ access cost, supporting both kinds of access is necessary, and that in this case our algorithm’s access cost is optimal.

🧭 Keyword Pioneer — data access bound
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Data Science & Analytics, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy
🌉 Interdisciplinary Bridge — Machine Learning and Security & Privacy