Research Explorer
Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
CVPR 2025
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
CVPR 2025
Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training
CVPR 2025
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer
ACL 2025
DeepLA-Net: Very Deep Local Aggregation Networks for Point Cloud Analysis
CVPR 2025
Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility
ICCV 2025
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning
ICCV 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
ACL 2025
Optimized Gradient Clipping for Noisy Label Learning
AAAI 2025
Value Residual Learning
ACL 2025
Error Analysis Affected by Heavy-Tailed Gradients for Non-Convex Pairwise Stochastic Gradient Descent
AAAI 2025
SuBiTO: Synopsis-based Training Optimization for Continuous Real-Time Neural Learning over Big Streaming Data
AAAI 2025
FREE: Fast and Robust Vision Language Models with Early Exits
ACL 2025
Run LoRA Run: Faster and Lighter LoRA Implementations
ACL 2025
A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models
ACL 2025
Low-Rank Interconnected Adaptation across Layers
ACL 2025
ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training
EMNLP 2025
Parameter-Efficient Fine-Tuning via Circular Convolution
ACL 2025
MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
EMNLP 2025
Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing
ACL 2025
Segment-Based Attention Masking for GPTs
ACL 2025
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection
ACL 2025
Language Models Grow Less Humanlike beyond Phase Transition
ACL 2025
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning
ACL 2025
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
ICCV 2025