Neural Network Optimization

3648 directly classified papers

Papers

Complexity Experts are Task-Discriminative Learners for Any Image Restoration CVPR 2025

LLS: Local Learning Rule for Deep Neural Networks Inspired by Neural Activity Synchronization WACV 2025

Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective COLING 2025

An Information-Theoretic Regularizer for Lossy Neural Image Compression ICCV 2025

Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models CVPR 2025

UCSC at SemEval-2025 Task 8: Question Answering over Tabular Data SEMEVAL 2025

SYSTRAN @ IWSLT 2025 Low-resource track ACL 2025

NAIST Simultaneous Speech Translation System for IWSLT 2025 ACL 2025

Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data ACL 2025

MT2ST: Adaptive Multi-Task to Single-Task Learning ACL 2025

Derivational Probing: Unveiling the Layer-wise Derivation of Syntactic Structures in Neural Language Models ACL 2025

Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review ACL 2025

Cautious Next Token Prediction ACL 2025

Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff ACL 2025

Verbosity-Aware Rationale Reduction: Sentence-Level Rationale Reduction for Efficient and Effective Reasoning ACL 2025

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization ACL 2025

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect ACL 2025

Stanford MLab at SemEval-2025 Task 11: Track B–Emotion Intensity Detection SEMEVAL 2025

YNU-HPCC at SemEval-2025 Task 6: Using BERT Model with R-drop for Promise Verification SEMEVAL 2025

Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning EMNLP 2025

Stanford MLab at SemEval-2025 Task 11: Track B–Emotion Intensity Detection ACL 2025

P3: Prompts Promote Prompting ACL 2025

Slamming: Training a Speech Language Model on One GPU in a Day ACL 2025

PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA ACL 2025

Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? ACL 2025