Neural Network Optimization

902 directly classified papers

Papers

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention ACL 2025

Value Residual Learning ACL 2025

Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer ACL 2025

Run LoRA Run: Faster and Lighter LoRA Implementations ACL 2025

Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory CVPR 2025

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models CVPR 2025

ICP: Immediate Compensation Pruning for Mid-to-high Sparsity CVPR 2025

Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights CVPR 2025

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes CVPR 2025

Learning from Streaming Video with Orthogonal Gradients CVPR 2025

Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning CVPR 2025

Scheduling Weight Transitions for Quantization-Aware Training ICCV 2025

Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution CVPR 2025

AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video CVPR 2025

The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models EMNLP 2025

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty CVPR 2025

LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation EMNLP 2025

Segment-Based Attention Masking for GPTs ACL 2025

EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference EMNLP 2025

Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers EMNLP 2025

LightThinker: Thinking Step-by-Step Compression EMNLP 2025

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching EMNLP 2025

Enhancing Chain-of-Thought Reasoning via Neuron Activation Differential Analysis EMNLP 2025

Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning EMNLP 2025

Last-iterate Convergence of Shuffling Momentum Gradient Method under the Kurdyka-Lojasiewicz Inequality JMLR 2025