Research Explorer
Theory
1072 directly classified papers
Papers per year
2007: 1
2010: 4
2011: 1
2012: 3
2013: 4
2014: 5
2015: 2
2016: 11
2017: 31
2018: 47
2019: 67
2020: 97
2021: 128
2022: 225
2023: 155
2024: 209
2025: 81
2026: 1
Papers
Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis
EMNLP 2024
Unveiling Linguistic Regions in Large Language Models
ACL 2024
On the Empirical Complexity of Reasoning and Planning in LLMs
EMNLP 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
NIPS 2024
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
NIPS 2024
Double-Descent Curves in Neural Networks: A New Perspective Using Gaussian Processes
AAAI 2024
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
ACL 2024
Unraveling the Gradient Descent Dynamics of Transformers
NIPS 2024
Epistemic Uncertainty Quantification For Pre-Trained Neural Networks
CVPR 2024
Exponential Hardness of Optimization from the Locality in Quantum Neural Networks
AAAI 2024
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models
EMNLP 2024
Polyhedral Complex Derivation from Piecewise Trilinear Networks
NIPS 2024
Nonlinear dynamics of localization in neural receptive fields
NIPS 2024
A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention
NIPS 2024
Understanding Surprising Generalization Phenomena in Deep Learning
AAAI 2024
How does Gradient Descent Learn Features: A Local Analysis for Regularized Two-Layer Neural Networks
NIPS 2024
Towards Constituting Mathematical Structures for Learning to Optimize
ICML 2023
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features
ICML 2023
How Powerful are Shallow Neural Networks with Bandlimited Random Weights?
ICML 2023
Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation
ICML 2023
On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters
ICML 2023
FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation
ICML 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
ICML 2023
Benign Overfitting in Two-layer ReLU Convolutional Neural Networks
ICML 2023
Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions
ICML 2023