Neural Network Optimization

902 directly classified papers

Papers

Optimal deep learning of holomorphic operators between Banach spaces NIPS 2024

The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof NIPS 2024

A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks NIPS 2024

MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts NIPS 2024

Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics NIPS 2024

Adaptive Depth Networks with Skippable Sub-Paths NIPS 2024

Provable Acceleration of Nesterov's Accelerated Gradient for Asymmetric Matrix Factorization and Linear Neural Networks NIPS 2024

Sparse maximal update parameterization: A holistic approach to sparse training dynamics NIPS 2024

Benign overfitting in leaky ReLU networks with moderate input dimension NIPS 2024

$\boldsymbol{\mu}\mathbf{P^2}$: Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling NIPS 2024

The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks NIPS 2024

Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient NIPS 2024

DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity NIPS 2024

Pipeline Parallelism with Controllable Memory NIPS 2024

Monomial Matrix Group Equivariant Neural Functional Networks NIPS 2024

Achieving Domain-Independent Certified Robustness via Knowledge Continuity NIPS 2024

Provable Tempered Overfitting of Minimal Nets and Typical Nets NIPS 2024

Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias AISTATS 2024

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning NIPS 2024

EG-NAS: Neural Architecture Search with Fast Evolutionary Exploration AAAI 2024

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit NIPS 2024

Implicit Bias in Noisy-SGD: With Applications to Differentially Private Training AISTATS 2024

DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment NIPS 2024

QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning AAAI 2024

The Implicit Bias of Gradient Descent toward Collaboration between Layers: A Dynamic Analysis of Multilayer Perceptrons NIPS 2024