Deeply-Supervised Nets

Chen-Yu Lee; Saining Xie; Patrick Gallagher; Zhengyou Zhang; Zhuowen Tu

AISTATS 2015

Abstract

We propose deeply-supervised nets (DSN), a method that simultaneously minimizes classification error and improves the directness and transparency of the hidden layer learning process. We focus our attention on three aspects of traditional convolutional-neural-network-type (CNN-type) architectures: (1) transparency in the effect intermediate layers have on overall classification; (2) discriminativeness and robustness of learned features, especially in early layers; (3) training effectiveness in the face of “vanishing” gradients. To combat these issues, we introduce “companion” objective functions at each hidden layer, in addition to the overall objective function at the output layer (an integrated strategy distinct from layer-wise pre-training). We also analyze our algorithm using techniques extended from stochastic gradient methods. The advantages provided by our method are evident in our experimental results, showing state-of-the-art performance on MNIST, CIFAR-10, CIFAR-100, and SVHN.
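The combined objective described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the loss form or how companion terms are weighted, so the softmax cross-entropy loss, the single fixed `companion_weight`, and the function names here are all assumptions made for the example. It takes the output-layer logits plus one set of logits per hidden-layer classifier and returns the output loss plus a weighted sum of the companion losses.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer class labels under softmax(logits)."""
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def deeply_supervised_loss(output_logits, companion_logits, labels,
                           companion_weight=0.3):
    """Output-layer loss plus weighted companion losses (hypothetical helper).

    output_logits:    array of shape (batch, classes) from the output layer.
    companion_logits: list of (batch, classes) arrays, one per hidden-layer
                      classifier attached for deep supervision.
    companion_weight: assumed fixed weight shared by all companion terms.
    """
    output_loss = softmax_cross_entropy(output_logits, labels)
    companion_loss = sum(
        softmax_cross_entropy(z, labels) for z in companion_logits
    )
    return output_loss + companion_weight * companion_loss


# Example call with random logits for a batch of 4 samples and 10 classes.
rng = np.random.default_rng(0)
labels = np.array([3, 1, 4, 1])
output_logits = rng.normal(size=(4, 10))
companion_logits = [rng.normal(size=(4, 10)) for _ in range(2)]
total = deeply_supervised_loss(output_logits, companion_logits, labels)
```

Because every term depends on the same labels, gradients from the companion terms reach the early layers directly rather than only through the full stack, which is the intuition behind the abstract's remark on "vanishing" gradients.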

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Neural Network Optimization

Keywords

image classification, supervised learning, deep learning, gradient descent, hidden layer, convolutional neural network
