Learning Curves for Deep Structured Gaussian Feature Models

Jacob Zavatone-Veth; Cengiz Pehlevan

NeurIPS 2023

Abstract

In recent years, significant attention in deep learning theory has been devoted to analyzing when models that interpolate their training data can still generalize well to unseen examples. Many insights have been gained from studying models with multiple layers of Gaussian random features, for which one can compute precise generalization asymptotics. However, few works have considered the effect of weight anisotropy; most assume that the random features are generated using independent and identically distributed Gaussian weights, and allow only for structure in the input data. Here, we use the replica trick from statistical physics to derive learning curves for models with many layers of structured Gaussian features. We show that allowing correlations between the rows of the first layer of features can aid generalization, while structure in later layers is generally detrimental. Our results shed light on how weight structure affects generalization in a simple class of solvable models.
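
The setting described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code, and it simulates rather than reproduces the replica calculation. It assumes linear Gaussian feature layers whose weight matrices have correlated rows, a fan-in normalization of 1/sqrt(n), isotropic Gaussian inputs, and a noiseless linear teacher; every function and parameter name is hypothetical.

```python
import numpy as np


def structured_weights(rng, n_in, n_out, row_cov=None):
    """Draw an (n_in x n_out) Gaussian weight matrix.

    If row_cov (n_in x n_in) is given, each column is drawn from N(0, row_cov),
    so the rows are correlated. Otherwise the entries are i.i.d. N(0, 1).
    """
    z = rng.standard_normal((n_in, n_out))
    if row_cov is None:
        return z
    # The Cholesky factor C satisfies C @ C.T == row_cov.
    return np.linalg.cholesky(row_cov) @ z


def deep_features(x, weights):
    """Map inputs through a stack of linear feature layers (assumed scaling)."""
    h = x
    for w in weights:
        h = h @ w / np.sqrt(w.shape[0])  # normalize by fan-in (an assumption)
    return h


def estimate_generalization_error(rng, p, d, widths, row_covs, n_test=2000):
    """Monte Carlo test error of the minimum-norm least-squares fit on the features."""
    dims = [d] + list(widths)
    weights = [
        structured_weights(rng, dims[i], dims[i + 1], row_covs[i])
        for i in range(len(widths))
    ]
    teacher = rng.standard_normal(d) / np.sqrt(d)  # noiseless linear target
    x_train = rng.standard_normal((p, d))
    x_test = rng.standard_normal((n_test, d))
    f_train = deep_features(x_train, weights)
    f_test = deep_features(x_test, weights)
    # lstsq returns the minimum-norm solution when the system is underdetermined.
    coef = np.linalg.lstsq(f_train, x_train @ teacher, rcond=None)[0]
    return float(np.mean((f_test @ coef - x_test @ teacher) ** 2))
```

Sweeping `p` at fixed `d` and `widths`, and averaging over several seeds, traces an empirical learning curve. Passing an identity versus a non-identity `row_covs[0]` gives a rough way to probe the effect of first-layer weight structure under these assumptions.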

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Statistical Learning
Mathematics & Optimization > Mathematics > Probability
Machine Learning > Bayesian & Probabilistic > Gaussian Processes
Deep Learning > Optimization & Theory > Theory

Keywords

statistical learning; deep learning theory; gaussian process; feature models; generalization bound; learning curve; statistical physics; gaussian random feature
