Theory
1072 directly classified papers
Papers per year
2007: 1
2010: 4
2011: 1
2012: 3
2013: 4
2014: 5
2015: 2
2016: 11
2017: 31
2018: 47
2019: 67
2020: 97
2021: 128
2022: 225
2023: 155
2024: 209
2025: 81
2026: 1
Papers
Convergence of Message-Passing Graph Neural Networks with Generic Aggregation on Large Random Graphs
JMLR 2024
Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK
JMLR 2024
The Loss Landscape of Deep Linear Neural Networks: a Second-order Analysis
JMLR 2024
High Probability Convergence Bounds for Non-convex Stochastic Gradient Descent with Sub-Weibull Noise
JMLR 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
NIPS 2024
Law of Large Numbers and Central Limit Theorem for Wide Two-layer Neural Networks: The Mini-Batch and Noisy Case
JMLR 2024
A PDE-based Explanation of Extreme Numerical Sensitivities and Edge of Stability in Training Neural Networks
JMLR 2024
Generalization and Stability of Interpolating Neural Networks with Minimal Width
JMLR 2024
Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
JMLR 2024
The Effect of Generalisation on the Inadequacy of the Mode
EACL 2024
Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms
NIPS 2024
Improving Normalization With the James-Stein Estimator
WACV 2024
The Expressive Capacity of State Space Models: A Formal Language Perspective
NIPS 2024
Overparametrized Multi-layer Neural Networks: Uniform Concentration of Neural Tangent Kernel and Convergence of Stochastic Gradient Descent
JMLR 2024
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
JMLR 2024
From Activation to Initialization: Scaling Insights for Optimizing Neural Fields
CVPR 2024
Multiple Descent in the Multiple Random Feature Model
JMLR 2024
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks
JMLR 2024
In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
NIPS 2024
Globally Convergent Variational Inference
NIPS 2024
Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature
NIPS 2024
Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
NIPS 2024
Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability
AISTATS 2024
Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics
NIPS 2024
Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
CVPR 2024