Joint Inference for Neural Network Depth and Dropout Regularization

Kishan K C; Rui Li; MohammadMahdi Gilany

NeurIPS 2021


Abstract

Dropout regularization methods prune a neural network's pre-determined backbone structure to avoid overfitting. However, a deep model still tends to be poorly calibrated, with high confidence on incorrect predictions. We propose a unified Bayesian model selection method that jointly infers the most plausible network depth warranted by the data and performs dropout regularization at the same time. In particular, to infer network depth we define a beta process over the number of hidden layers, which allows the depth to go to infinity. Layer-wise activation probabilities induced by the beta process modulate neuron activation via binary vectors of a conjugate Bernoulli process. Experiments across domains show that, by adapting network depth and dropout regularization to data, our method achieves superior performance compared to state-of-the-art methods, with well-calibrated uncertainty estimates. In continual learning, our method enables neural networks to dynamically evolve their depths to accommodate incrementally available data beyond their initial structures, and alleviates catastrophic forgetting.
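The generative idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a truncated stick-breaking construction of the beta process (layer probabilities formed as cumulative products of Beta draws), a fixed layer width, and a residual-style forward pass so that a fully masked layer passes its input through unchanged. The function names, the truncation level, and the hyperparameters `alpha` and `beta` are hypothetical choices made for illustration.

```python
import numpy as np


def sample_layer_masks(alpha, beta, width, truncation, rng):
    """Sample layer-wise activation probabilities and binary neuron masks.

    Assumption: a truncated stick-breaking construction, where
    nu_l ~ Beta(alpha, beta) and pi_l = prod_{j<=l} nu_j, so the
    activation probability can only shrink with depth.
    """
    nu = rng.beta(alpha, beta, size=truncation)
    pi = np.cumprod(nu)
    # Bernoulli process: neuron m in layer l is active with probability pi_l.
    z = (rng.random((truncation, width)) < pi[:, None]).astype(np.float64)
    return pi, z


def inferred_depth(z):
    """Depth = index of the deepest layer with at least one active neuron."""
    active = np.flatnonzero(z.sum(axis=1) > 0)
    return 0 if active.size == 0 else int(active[-1]) + 1


def forward(x, weights, z):
    """Residual-style pass (an assumption): masked ReLU units are added to
    the running representation, so an all-zero mask leaves it unchanged."""
    h = x
    for W, mask in zip(weights, z):
        h = h + mask * np.maximum(W @ h, 0.0)
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    width, truncation = 8, 20
    pi, z = sample_layer_masks(2.0, 1.0, width, truncation, rng)
    weights = [0.1 * rng.standard_normal((width, width)) for _ in range(truncation)]
    out = forward(np.ones(width), weights, z)
    print("activation probabilities:", np.round(pi, 3))
    print("effective depth:", inferred_depth(z))
    print("output shape:", out.shape)
```

Because the activation probabilities are non-increasing in the layer index, deep layers tend to switch off entirely unless the data support them. In this sketch the same binary masks therefore act both as a depth selector and as a dropout-like regularizer.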


Authors

Kishan K C, Rui Li, MohammadMahdi Gilany

Topics

Machine Learning > Learning Types > Continual Learning
Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Optimization & Theory > Stochastic Processes
Machine Learning > Bayesian & Probabilistic > Bayesian Inference

Keywords

neural network architecture, continual learning, model selection, bayesian inference, uncertainty quantification, bayesian model selection, stochastic process, beta process, dropout regularization, neural network depth

