Research Explorer
Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices
NIPS 2024
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
NIPS 2024
An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
NIPS 2024
OneBit: Towards Extremely Low-bit Large Language Models
NIPS 2024
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
AISTATS 2024
Learning Neural Contracting Dynamics: Extended Linearization and Global Guarantees
NIPS 2024
Bayes-optimal learning of an extensive-width neural network from quadratically many samples
NIPS 2024
Neural Collapse To Multiple Centers For Imbalanced Data
NIPS 2024
QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion
NIPS 2024
Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets
NIPS 2024
Improving Equivariant Model Training via Constraint Relaxation
NIPS 2024
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
NIPS 2024
DiffLoc: Diffusion Model for Outdoor LiDAR Localization
CVPR 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
ACL 2024
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
EMNLP 2024
Deep Learning for Computing Convergence Rates of Markov Chains
NIPS 2024
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
EMNLP 2024
Second-order forward-mode optimization of recurrent neural networks for neuroscience
NIPS 2024
Unraveling the Gradient Descent Dynamics of Transformers
NIPS 2024
nanoT5: Fast & Simple Pre-training and Fine-tuning of T5 Models with Limited Resources
EMNLP 2023
Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks
EMNLP 2023
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training
EMNLP 2023
AdaNorm: Adaptive Gradient Norm Correction Based Optimizer for CNNs
WACV 2023
TokenDrop + BucketSampler: Towards Efficient Padding-free Fine-tuning of Language Models
EMNLP 2023
Addressing the Length Bias Challenge in Document-Level Neural Machine Translation
EMNLP 2023