Hypernetwork-based Implicit Posterior Estimation and Model Averaging of CNN

Kenya Ukai; Takashi Matsubara; Kuniaki Uehara

ACML 2018

Abstract

Deep neural networks have a rich ability to learn complex representations and have achieved remarkable results in various tasks. However, they are prone to overfitting when the number of training samples is limited, so regularizing the learning process of neural networks is critical. In this paper, we propose a novel regularization method that estimates the parameters of a large convolutional neural network as implicit probabilistic distributions generated by a hypernetwork. The estimated distributions also allow us to perform model averaging to improve network performance. Experimental results demonstrate that our regularization method outperforms the commonly used maximum a posteriori (MAP) estimation.
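
The idea of drawing network parameters from a hypernetwork and averaging the resulting predictions can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the architecture or the training procedure, so the affine hypernetwork, the single linear layer standing in for the CNN, and all names (`make_hypernetwork`, `predict`, `model_average`) are hypothetical, and the hypernetwork here is untrained. The sketch only shows the sampling and averaging steps, assuming Gaussian input noise.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_hypernetwork(noise_dim, n_weights, rng):
    # Hypothetical hypernetwork: a fixed random affine map from a noise
    # vector z to a flat weight vector. A real one would be trained.
    matrix = [[rng.gauss(0.0, 0.1) for _ in range(noise_dim)]
              for _ in range(n_weights)]
    bias = [rng.gauss(0.0, 1.0) for _ in range(n_weights)]

    def generate(z):
        return [b + sum(m * zi for m, zi in zip(row, z))
                for row, b in zip(matrix, bias)]

    return generate


def predict(weights, x, n_classes):
    # Stand-in for the primary network: one linear layer plus softmax.
    n_in = len(x)
    logits = [sum(weights[c * n_in + i] * x[i] for i in range(n_in))
              for c in range(n_classes)]
    return softmax(logits)


def model_average(generate, x, n_classes, noise_dim, n_samples, rng):
    # Sample several weight vectors from the implicit distribution and
    # average the class probabilities they produce.
    avg = [0.0] * n_classes
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        probs = predict(generate(z), x, n_classes)
        avg = [a + p / n_samples for a, p in zip(avg, probs)]
    return avg


rng = random.Random(0)
generate = make_hypernetwork(noise_dim=4, n_weights=3 * 2, rng=rng)
averaged = model_average(generate, x=[1.0, -0.5], n_classes=3,
                         noise_dim=4, n_samples=20, rng=rng)
print(averaged)
```

Each noise sample yields a different weight vector, so the distribution over weights is defined only through sampling, which is what makes it implicit. Averaging probabilities rather than weights is one common choice for model averaging and is assumed here.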

Authors

Kenya Ukai, Takashi Matsubara, Kuniaki Uehara

Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Machine Learning > Optimization & Theory > Stochastic Processes
Deep Learning > Architectures > Neural Networks
Machine Learning > Bayesian & Probabilistic > Bayesian Inference
Deep Learning > Optimization & Theory > Model Compression

Keywords

Bayesian inference, posterior estimation, model averaging, convolutional neural network, implicit probabilistic distribution

Related papers

Unsupervised Heterogeneous Domain Adaptation with Sparse Feature Transformation 2018

Structured Gaussian Processes with Twin Multiple Kernel Learning 2018

Discriminative Feature Representation for Person Re-identification by Batch-contrastive Loss 2018

Adversarial TableQA: Attention Supervision for Question Answering on Tables 2018

Who Are Raising Their Hands? Hand-Raiser Seeking Based on Object Detection and Pose Estimation 2018