Training Deep Neural Networks with Virtual Smoothing Classes

Zhiyang Zhou; Siwei Wei; Xudong Zhang; Wensheng Dou; Muzi Qu; Yan Cai

AAAI 2025

Abstract

Learning with softmax cross-entropy on one-hot labels often leads to overconfidence on the correct class. Label smoothing regulates this overconfidence by redistributing some confidence from the correct class to the incorrect classes, but it erodes the information the logits carry about the similarity between samples of different classes, and it may hurt calibration when high accuracy requires higher confidence. To overcome these limitations, we propose the Virtual Smoothing (VS) label, which redistributes some confidence from the correct class to additional VS classes to regularize overconfidence. In VS labels, the VS class nodes act as adversaries to the original class nodes, enforcing regularization by clustering samples across all classes. Because each incorrect class is assigned zero confidence, the incorrect logits remain free to differ from one another, which preserves information about sample similarities. The prediction probability can still approach 1 when softmax is applied to the logits of the original real classes, so VS labels improve calibration rather than harm it. Experiments show that VS labels consistently improve accuracy and calibration while providing better logits for knowledge distillation. VS labels are also effective in improving adversarial training, robust distillation, and out-of-distribution detection.
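The contrast between label smoothing and VS labels can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothing amount `alpha`, the number of virtual classes `num_virtual`, and the uniform split of the redistributed confidence across the VS classes are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def label_smoothing_target(correct, num_classes, alpha):
    """Label smoothing: move alpha from the correct class to the incorrect classes."""
    off = alpha / (num_classes - 1)
    return [1.0 - alpha if k == correct else off for k in range(num_classes)]

def virtual_smoothing_target(correct, num_classes, num_virtual, alpha):
    """VS-style target (assumed form): move alpha to extra virtual classes.

    Incorrect real classes keep zero confidence; the uniform split over
    the virtual classes is an assumption of this sketch.
    """
    target = [0.0] * (num_classes + num_virtual)
    target[correct] = 1.0 - alpha
    for v in range(num_classes, num_classes + num_virtual):
        target[v] = alpha / num_virtual
    return target

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy between a soft target and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)

def predict_real(logits, num_classes):
    """Inference: softmax over the real-class logits only, ignoring virtual ones."""
    return softmax(logits[:num_classes])

if __name__ == "__main__":
    K, V, alpha = 3, 2, 0.2               # hypothetical sizes and smoothing amount
    logits = [8.0, 1.0, -2.0, 6.0, 6.0]   # 3 real logits followed by 2 virtual logits
    print(label_smoothing_target(0, K, alpha))
    print(virtual_smoothing_target(0, K, V, alpha))
    print(cross_entropy(logits, virtual_smoothing_target(0, K, V, alpha)))
    print(softmax(logits)[0], predict_real(logits, K)[0])
```

In this toy example the virtual logits absorb part of the probability mass during training, so the confidence on the correct class over all nodes stays bounded, while the softmax restricted to the real classes can still be close to 1 at inference time.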

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Loss Functions
Deep Learning > Techniques > Normalization
Deep Learning > Optimization & Theory > Loss Functions
Machine Learning > Learning Types > Regularization

Keywords

model calibration; knowledge distillation; adversarial training; label smoothing; cross-entropy loss; overconfidence regularization
