Neural Network Optimization

902 directly classified papers

Papers

Revisiting Downsampling in Semantic Segmentation: Fighting Aliasing with Dynamic Gaussian and Gabor Frequency Filters AAAI 2026

Revisiting Network Inertia: Dynamic Inertia Inhibition Coupled Multidimensional Periodicity for Infrared and Visible Image Fusion AAAI 2026

Universal Neural Architecture Space: Covering ConvNets, Transformers and Everything in Between WACV 2026

Revisiting Layer Normalization for Point Cloud Test Time Adaptation WACV 2026

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching EMNLP 2025

Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers EMNLP 2025

Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models EMNLP 2025

Run LoRA Run: Faster and Lighter LoRA Implementations ACL 2025

LightThinker: Thinking Step-by-Step Compression EMNLP 2025

NeuroAda: Activating Each Neuron’s Potential for Parameter-Efficient Fine-Tuning EMNLP 2025

Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning EMNLP 2025

The Role of Outgoing Connection Heterogeneity in Feedforward Layers of Large Language Models EMNLP 2025

LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation EMNLP 2025

EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference EMNLP 2025

Segment-Based Attention Masking for GPTs ACL 2025

Language Models Grow Less Humanlike beyond Phase Transition ACL 2025

Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing ACL 2025

MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection ACL 2025

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention ACL 2025

Value Residual Learning ACL 2025

FREE: Fast and Robust Vision Language Models with Early Exits ACL 2025

Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer ACL 2025

A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation EMNLP 2025

Low-Rank Interconnected Adaptation across Layers ACL 2025

Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning ACL 2025