Training Deep Neural Networks via Direct Loss Minimization

Yang Song; Alexander Schwing; Richard; Raquel Urtasun

ICML 2016

Abstract

Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise.
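To make concrete why a metric such as average precision is "neither smooth nor decomposable", the following minimal sketch computes AP for a scored list of binary-labeled items. It is an illustration only, not the paper's dynamic programming algorithm; the function name `average_precision` and the toy scores are assumptions introduced here.

```python
def average_precision(scores, labels):
    """AP of a ranking induced by `scores`; labels are 1 (positive) or 0 (negative).

    Hypothetical helper for illustration; ties are broken by input order.
    """
    # Rank items from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision at the rank of each positive item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


labels = [1, 0, 1, 0]
ap_base = average_precision([0.90, 0.80, 0.70, 0.60], labels)
# A small score change that keeps the ordering leaves AP unchanged.
ap_nudged = average_precision([0.91, 0.80, 0.70, 0.60], labels)
# A score change that swaps a positive above a negative makes AP jump.
ap_swapped = average_precision([0.90, 0.65, 0.70, 0.60], labels)
print(ap_base, ap_nudged, ap_swapped)
```

Because AP depends only on the ordering of all items jointly, it is piecewise constant in the scores (zero gradient almost everywhere) and cannot be written as a sum of per-example losses, which is the difficulty the abstract refers to.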

Authors

Yang Song, Alexander Schwing, Richard, Raquel Urtasun

Topics

Machine Learning > Optimization & Theory > Optimization
Deep Learning > Techniques > Model Architecture
Machine Learning > Core Methods > Optimization
Deep Learning > Learning Types > Deep Learning

Keywords

neural network training, object detection, label noise, dynamic programming, gradient-based method, ranking problem, direct loss minimization, average precision, application-specific loss
