PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

Sanae Lotfi; Marc Finzi; Sanyam Kapoor; Andres Potapczynski; Micah Goldblum; Andrew G Wilson

NeurIPS 2022

Abstract

While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on quantizing neural network parameters in a linear subspace, profoundly improving on previous results to provide state-of-the-art generalization bounds on a variety of tasks, including transfer learning. We use these tight bounds to better understand the role of model size, equivariance, and the implicit biases of optimization, for generalization in deep learning. Notably, we find large models can be compressed to a much greater extent than previously known, encapsulating Occam’s razor.
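As a rough illustration of the idea the abstract describes (parameters restricted to a linear subspace, then quantized, with the compressed size driving a generalization bound), here is a minimal Python sketch. It is not the paper's method: it swaps in a textbook Occam bound for a prefix-free code in place of the paper's PAC-Bayes bound, and a naive fixed-length code in place of whatever coding scheme the authors use. All function names, the quantization levels, and the numbers in the usage lines are hypothetical.

```python
import math
import random


def random_projection(D, d, seed=0):
    """Hypothetical D x d Gaussian matrix, fixed by a seed chosen before seeing data."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(D)
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(D)]


def expand(theta0, P, z):
    """Full parameter vector theta = theta0 + P z from d subspace coordinates z."""
    return [t + sum(p * zj for p, zj in zip(row, z)) for t, row in zip(theta0, P)]


def quantize(values, levels):
    """Index of the nearest quantization level for each value."""
    return [min(range(len(levels)), key=lambda k: abs(v - levels[k])) for v in values]


def code_length_bits(indices, levels, bits_per_float=16):
    """Naive fixed-length code: ceil(log2 L) bits per coordinate plus the codebook.

    Assumption: the number of coordinates and levels is fixed in advance.
    """
    per_symbol = math.ceil(math.log2(len(levels)))
    return len(indices) * per_symbol + len(levels) * bits_per_float


def occam_bound(train_error, code_bits, n, delta=0.05):
    """Textbook Occam bound for 0-1 loss (not the paper's bound).

    With probability at least 1 - delta over n i.i.d. samples:
        risk <= train_error + sqrt((code_bits * ln 2 + ln(1/delta)) / (2 n)).
    """
    complexity = code_bits * math.log(2) + math.log(1.0 / delta)
    return train_error + math.sqrt(complexity / (2.0 * n))


# Usage with made-up numbers: D = 200 parameters, d = 10 subspace coordinates.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
z = [0.1, -0.9, 0.6, 0.0, 0.4, -0.3, 1.2, -1.1, 0.2, 0.7]
idx = quantize(z, levels)
z_hat = [levels[k] for k in idx]
theta = expand([0.0] * 200, random_projection(200, 10), z_hat)
bits = code_length_bits(idx, levels)
print(bits, occam_bound(train_error=0.1, code_bits=bits, n=50000))
```

The point of the sketch is the trade-off: fewer subspace coordinates and fewer levels shrink the code length and tighten the complexity term, but they can raise the training error of the compressed model. A bound of this form becomes vacuous once the code length is comparable to the number of training examples.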

Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Optimization & Theory > Learning Theory
Deep Learning > Optimization & Theory > Theory
Machine Learning > Optimization & Theory > Generalization

Keywords

neural network compression, transfer learning, generalization bound, PAC-Bayes bound, neural network, Occam's razor
