Neural Network Optimization

902 directly classified papers

Papers

An Efficient High-dimensional Gradient Estimator for Stochastic Differential Equations NIPS 2024

A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks NIPS 2024

Post-Hoc Reversal: Are We Selecting Models Prematurely? NIPS 2024

Optimal deep learning of holomorphic operators between Banach spaces NIPS 2024

Continual learning with the neural tangent ensemble NIPS 2024

Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling NIPS 2024

Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent NIPS 2024

Increasing Biases Can Be More Efficient Than Increasing Weights WACV 2024

Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods JMLR 2024

Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization NIPS 2024

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference NIPS 2024

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective NIPS 2024

Learning Discretized Neural Networks under Ricci Flow JMLR 2024

On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions NIPS 2024

On Feature Learning in Structured State Space Models NIPS 2024

Loki: Low-rank Keys for Efficient Sparse Attention NIPS 2024

Ordered Momentum for Asynchronous SGD NIPS 2024

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation NIPS 2024

Understanding and Minimising Outlier Features in Transformer Training NIPS 2024

Why Do We Need Weight Decay in Modern Deep Learning? NIPS 2024

Sharpness-Aware Minimization and the Edge of Stability JMLR 2024

Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training NIPS 2024

Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks JMLR 2024

DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment NIPS 2024

Local to Global: Learning Dynamics and Effect of Initialization for Transformers NIPS 2024