Neural Network Optimization

902 directly classified papers

Papers

Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution CVPR 2025

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention CVPR 2025

Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training CVPR 2025

Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer ACL 2025

DeepLA-Net: Very Deep Local Aggregation Networks for Point Cloud Analysis CVPR 2025

Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility ICCV 2025

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning ICCV 2025

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention ACL 2025

Optimized Gradient Clipping for Noisy Label Learning AAAI 2025

Value Residual Learning ACL 2025

Error Analysis Affected by Heavy-Tailed Gradients for Non-Convex Pairwise Stochastic Gradient Descent AAAI 2025

SuBiTO: Synopsis-based Training Optimization for Continuous Real-Time Neural Learning over Big Streaming Data AAAI 2025

FREE: Fast and Robust Vision Language Models with Early Exits ACL 2025

Run LoRA Run: Faster and Lighter LoRA Implementations ACL 2025

A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models ACL 2025

Low-Rank Interconnected Adaptation across Layers ACL 2025

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training EMNLP 2025

Parameter-Efficient Fine-Tuning via Circular Convolution ACL 2025

MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices EMNLP 2025

Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing ACL 2025

Segment-Based Attention Masking for GPTs ACL 2025

MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection ACL 2025

Language Models Grow Less Humanlike beyond Phase Transition ACL 2025

Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning ACL 2025

Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ICCV 2025