Research Explorer
Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
Scalable Optimization in the Modular Norm
NIPS 2024
The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
NIPS 2024
Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions
JMLR 2024
Memory-Efficient LLM Training with Online Subspace Descent
NIPS 2024
ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate
NIPS 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
NIPS 2024
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
NIPS 2024
Counter-Current Learning: A Biologically Plausible Dual Network Approach for Deep Learning
NIPS 2024
Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization
NIPS 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
NIPS 2024
Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
NIPS 2024
Zero-Shot Transfer of Neural ODEs
NIPS 2024
PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices
NIPS 2024
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
NIPS 2024
OneBit: Towards Extremely Low-bit Large Language Models
NIPS 2024
Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
NIPS 2024
Local to Global: Learning Dynamics and Effect of Initialization for Transformers
NIPS 2024
Great Minds Think Alike: The Universal Convergence Trend of Input Salience
NIPS 2024
Learning Neural Contracting Dynamics: Extended Linearization and Global Guarantees
NIPS 2024
Symmetries in Overparametrized Neural Networks: A Mean Field View
NIPS 2024
Neural Collapse To Multiple Centers For Imbalanced Data
NIPS 2024
Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets
NIPS 2024
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
NIPS 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
NIPS 2024
DiffLoc: Diffusion Model for Outdoor LiDAR Localization
CVPR 2024