Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models
CVPR 2024
Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning
CVPR 2024
Expanding Sparse Tuning for Low Memory Usage
NIPS 2024
Neural Redshift: Random Networks are not Random Functions
CVPR 2024
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
CVPR 2024
A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
NIPS 2024
Deep linear networks for regression are implicitly regularized towards flat minima
NIPS 2024
Understanding and Minimising Outlier Features in Transformer Training
NIPS 2024
Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
NIPS 2024
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference
NIPS 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
NIPS 2024
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
NIPS 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
NIPS 2024
Why is parameter averaging beneficial in SGD? An objective smoothing perspective
AISTATS 2024
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
AISTATS 2024
Sharpened Lazy Incremental Quasi-Newton Method
AISTATS 2024
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
NIPS 2024
ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate
NIPS 2024
On the Use of Anchoring for Training Vision Models
NIPS 2024
Symmetries in Overparametrized Neural Networks: A Mean Field View
NIPS 2024
Great Minds Think Alike: The Universal Convergence Trend of Input Salience
NIPS 2024
Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
NIPS 2024
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference
EACL 2024
Second-order forward-mode optimization of recurrent neural networks for neuroscience
NIPS 2024
Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
NIPS 2024