Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
ACL 2025
Value Residual Learning
ACL 2025
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer
ACL 2025
Run LoRA Run: Faster and Lighter LoRA Implementations
ACL 2025
Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory
CVPR 2025
WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models
CVPR 2025
ICP: Immediate Compensation Pruning for Mid-to-high Sparsity
CVPR 2025
Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights
CVPR 2025
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
CVPR 2025
Learning from Streaming Video with Orthogonal Gradients
CVPR 2025
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
CVPR 2025
Scheduling Weight Transitions for Quantization-Aware Training
ICCV 2025
Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
CVPR 2025
AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video
CVPR 2025
The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models
EMNLP 2025
LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty
CVPR 2025
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
EMNLP 2025
Segment-Based Attention Masking for GPTs
ACL 2025
EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference
EMNLP 2025
Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers
EMNLP 2025
LightThinker: Thinking Step-by-Step Compression
EMNLP 2025
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
EMNLP 2025
Enhancing Chain-of-Thought Reasoning via Neuron Activation Differential Analysis
EMNLP 2025
Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning
EMNLP 2025
Last-iterate Convergence of Shuffling Momentum Gradient Method under the Kurdyka-Lojasiewicz Inequality
JMLR 2025