Research Explorer
Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
An Efficient High-dimensional Gradient Estimator for Stochastic Differential Equations
NIPS 2024
A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
NIPS 2024
Post-Hoc Reversal: Are We Selecting Models Prematurely?
NIPS 2024
Optimal deep learning of holomorphic operators between Banach spaces
NIPS 2024
Continual learning with the neural tangent ensemble
NIPS 2024
Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
NIPS 2024
Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
NIPS 2024
Increasing Biases Can Be More Efficient Than Increasing Weights
WACV 2024
Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods
JMLR 2024
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
NIPS 2024
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference
NIPS 2024
Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective
NIPS 2024
Learning Discretized Neural Networks under Ricci Flow
JMLR 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
NIPS 2024
On Feature Learning in Structured State Space Models
NIPS 2024
Loki: Low-rank Keys for Efficient Sparse Attention
NIPS 2024
Ordered Momentum for Asynchronous SGD
NIPS 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
NIPS 2024
Understanding and Minimising Outlier Features in Transformer Training
NIPS 2024
Why Do We Need Weight Decay in Modern Deep Learning?
NIPS 2024
Sharpness-Aware Minimization and the Edge of Stability
JMLR 2024
Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training
NIPS 2024
Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks
JMLR 2024
DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment
NIPS 2024
Local to Global: Learning Dynamics and Effect of Initialization for Transformers
NIPS 2024