Research Explorer
Neural Network Optimization
902 directly classified papers
Papers per year
2007: 1
2009: 1
2010: 2
2011: 1
2012: 3
2013: 4
2014: 1
2015: 9
2016: 14
2017: 20
2018: 30
2019: 66
2020: 127
2021: 106
2022: 117
2023: 106
2024: 190
2025: 100
2026: 4
Papers
Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?
EMNLP 2025
Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning
EMNLP 2025
A Proactive Reliability Metric for Detecting Failures in Language Model Training
EMNLP 2025
Exploring smaller batch sizes for a high-performing BabyLM model architecture
EMNLP 2025
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability
CVPR 2025
Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query
EMNLP 2025
Language Models Grow Less Humanlike beyond Phase Transition
ACL 2025
Low-Rank Interconnected Adaptation across Layers
ACL 2025
Segment-Based Attention Masking for GPTs
ACL 2025
Value Residual Learning
ACL 2025
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning
ACL 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
ACL 2025
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer
ACL 2025
A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation
EMNLP 2025
FREE: Fast and Robust Vision Language Models with Early Exits
ACL 2025
IG-Pruning: Input-Guided Block Pruning for Large Language Models
EMNLP 2025
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
EMNLP 2025
Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers
EMNLP 2025
Where to show Demos in Your Prompt: A Positional Bias of In-Context Learning
EMNLP 2025
GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
EMNLP 2025
The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models
EMNLP 2025
LightThinker: Thinking Step-by-Step Compression
EMNLP 2025
Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing
ACL 2025
NeuroAda: Activating Each Neuron’s Potential for Parameter-Efficient Fine-Tuning
EMNLP 2025
EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference
EMNLP 2025