Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference

Arnaud Descours; Tom Huix; Arnaud Guillin; Manon Michel; Eric Moulines; Boris Nectoux

COLT 2023

Abstract

We provide a rigorous analysis of training by variational inference (VI) of Bayesian neural networks in the two-layer and infinite-width case. We consider a regression problem with a regularized evidence lower bound (ELBO) which is decomposed into the expected log-likelihood of the data and the Kullback-Leibler (KL) divergence between the a priori distribution and the variational posterior. With an appropriate weighting of the KL, we prove a law of large numbers for three different training schemes: (i) the idealized case with exact estimation of a multiple Gaussian integral from the reparametrization trick, (ii) a minibatch scheme using Monte Carlo sampling, commonly known as Bayes by Backprop, and (iii) a new and computationally cheaper algorithm which we introduce as Minimal VI. An important result is that all methods converge to the same mean-field limit. Finally, we illustrate our results numerically and discuss the need for the derivation of a central limit theorem.
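To make the objective in the abstract concrete, the following is a minimal sketch of a KL-weighted ELBO estimated by Monte Carlo sampling with the reparametrization trick. It is not the authors' implementation. The tanh activation, the standard normal prior, the unit-variance Gaussian likelihood, the softplus parametrization of the standard deviation, the choice kl_weight = 1/N, and all function names are assumptions made for illustration; the abstract only states that the KL term receives "an appropriate weighting".

```python
import numpy as np

def softplus(rho):
    # Numerically stable log(1 + exp(rho)); keeps the standard deviation positive.
    return np.logaddexp(0.0, rho)

def sample_weights(m, rho, rng):
    # Reparametrization trick: w = m + sigma * eps with eps ~ N(0, I).
    eps = rng.standard_normal(m.shape)
    return m + softplus(rho) * eps

def kl_to_standard_normal(m, rho):
    # Closed-form KL( N(m, diag(sigma^2)) || N(0, I) ), summed over all coordinates.
    sigma = softplus(rho)
    return 0.5 * np.sum(sigma**2 + m**2 - 1.0 - 2.0 * np.log(sigma))

def network(x, w):
    # Two-layer network averaged over N neurons: f(x) = (1/N) * sum_i tanh(w_i . x).
    # (Assumed architecture; the abstract does not specify the activation.)
    return np.mean(np.tanh(x @ w.T), axis=1)

def elbo_estimate(x, y, m, rho, kl_weight, rng, n_samples=8):
    # Monte Carlo estimate of E_q[log-likelihood] - kl_weight * KL(q || prior),
    # with a unit-variance Gaussian log-likelihood taken up to an additive constant.
    log_lik = 0.0
    for _ in range(n_samples):
        w = sample_weights(m, rho, rng)
        log_lik += -0.5 * np.sum((y - network(x, w)) ** 2)
    return log_lik / n_samples - kl_weight * kl_to_standard_normal(m, rho)

# Usage on synthetic data (hypothetical sizes and targets).
rng = np.random.default_rng(0)
N, d = 50, 3                          # number of neurons, input dimension
x = rng.standard_normal((20, d))      # 20 synthetic inputs
y = np.sin(x[:, 0])                   # synthetic regression targets
m = rng.standard_normal((N, d))       # variational means, one row per neuron
rho = np.zeros((N, d))                # pre-softplus standard deviations
value = elbo_estimate(x, y, m, rho, kl_weight=1.0 / N, rng=rng)
```

Drawing fresh Gaussian noise at each evaluation corresponds to the Monte Carlo flavor of scheme (ii); the idealized scheme (i) would replace the sample average by the exact Gaussian integral, which this sketch does not attempt.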

Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Deep Learning > Models > Variational Inference

Keywords

variational inference, Monte Carlo sampling, Bayesian neural network, evidence lower bound, law of large numbers, reparametrization trick
