Theory
1072 directly classified papers
Papers per year
2007: 1
2010: 4
2011: 1
2012: 3
2013: 4
2014: 5
2015: 2
2016: 11
2017: 31
2018: 47
2019: 67
2020: 97
2021: 128
2022: 225
2023: 155
2024: 209
2025: 81
2026: 1
Papers
On the Expressivity Role of LayerNorm in Transformers’ Attention
ACL 2023
On double-descent in uncertainty quantification in overparametrized models
AISTATS 2023
Complex-valued Neurons Can Learn More but Slower than Real-valued Neurons via Gradient Descent
NIPS 2023
Provable Advantage of Curriculum Learning on Parity Targets with Mixed Inputs
NIPS 2023
Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime
AISTATS 2023
Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
NIPS 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
AAAI 2023
Window-Based Distribution Shift Detection for Deep Neural Networks
NIPS 2023
Transformers learn through gradual rank increase
NIPS 2023
Polynomially Over-Parameterized Convolutional Neural Networks Contain Structured Strong Winning Lottery Tickets
NIPS 2023
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding
EMNLP 2023
Scaling Law for Document Neural Machine Translation
EMNLP 2023
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
EMNLP 2023
Emergent Inabilities? Inverse Scaling Over the Course of Pretraining
EMNLP 2023
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
CVPR 2023
VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution
CVPR 2023
Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization
CVPR 2023
SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
CVPR 2023
Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
CVPR 2023
Initialization Noise in Image Gradients and Saliency Maps
CVPR 2023
Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks
AAAI 2023
Local Intrinsic Dimensional Entropy
AAAI 2023
The Implicit Regularization of Momentum Gradient Descent in Overparametrized Models
AAAI 2023
On Data Scaling in Masked Image Modeling
CVPR 2023
On the Ability of Graph Neural Networks to Model Interactions Between Vertices
NIPS 2023