Neural Network Optimization

902 directly classified papers

Papers

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention ACL 2025

Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer ACL 2025

Run LoRA Run: Faster and Lighter LoRA Implementations ACL 2025

Parameter-Efficient Fine-Tuning via Circular Convolution ACL 2025

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models CVPR 2024

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields CVPR 2024

Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning CVPR 2024

Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers CVPR 2024

Neural Redshift: Random Networks are not Random Functions CVPR 2024

Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers CVPR 2024

Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture CVPR 2024

Continual learning with the neural tangent ensemble NIPS 2024

Discovering Knowledge-Critical Subnetworks in Pretrained Language Models EMNLP 2024

Where Do Large Learning Rates Lead Us? NIPS 2024

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit NIPS 2024

DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment NIPS 2024

Fast Forwarding Low-Rank Training EMNLP 2024

Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective EMNLP 2024

Stable Language Model Pre-training by Reducing Embedding Variability EMNLP 2024

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech EMNLP 2024

Provable Tempered Overfitting of Minimal Nets and Typical Nets NIPS 2024

Achieving Domain-Independent Certified Robustness via Knowledge Continuity NIPS 2024

Abrupt Learning in Transformers: A Case Study on Matrix Completion NIPS 2024

Pipeline Parallelism with Controllable Memory NIPS 2024

Monomial Matrix Group Equivariant Neural Functional Networks NIPS 2024