Neural Network Optimization

902 directly classified papers

Papers

MLP Can Be A Good Transformer Learner CVPR 2024

Instance-Aware Group Quantization for Vision Transformers CVPR 2024

Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner CVPR 2024

Monomial Matrix Group Equivariant Neural Functional Networks NIPS 2024

ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate NIPS 2024

Deep linear networks for regression are implicitly regularized towards flat minima NIPS 2024

Learning Neural Contracting Dynamics: Extended Linearization and Global Guarantees NIPS 2024

Neural Collapse To Multiple Centers For Imbalanced Data NIPS 2024

OneBit: Towards Extremely Low-bit Large Language Models NIPS 2024

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level NIPS 2024

Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers CVPR 2024

Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets NIPS 2024

The Impact of Geometric Complexity on Neural Collapse in Transfer Learning NIPS 2024

The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks NIPS 2024

PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization NIPS 2024

BMRS: Bayesian Model Reduction for Structured Pruning NIPS 2024

Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture CVPR 2024

ADR-X: ANN-Assisted Wireless Link Rate Adaptation for Compute-Constrained Embedded Gaming Devices NSDI 2024

SGD vs GD: Rank Deficiency in Linear Networks NIPS 2024

Memory-Efficient LLM Training with Online Subspace Descent NIPS 2024

PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices NIPS 2024

Benign overfitting in leaky ReLU networks with moderate input dimension NIPS 2024

$\boldsymbol{\mu}\mathbf{P^2}$: Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling NIPS 2024

Continual learning with the neural tangent ensemble NIPS 2024

The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks NIPS 2024