Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks
OSDI 2025
LESA: Learnable LLM Layer Scaling-Up
ACL 2025
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
ACL 2025
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
ACL 2025
IAM: Efficient Inference through Attention Mapping between Different-scale LLMs
ACL 2025
Optimizing RLHF Training for Large Language Models with Stage Fusion
NSDI 2025
MLAS-LoRA: Language-Aware Parameters Detection and LoRA-Based Knowledge Transfer for Multilingual Machine Translation
ACL 2025
Masks Can be Learned as an Alternative to Experts
ACL 2025
Analyzing the Rapid Generalization of SFT via the Perspective of Attention Head Activation Patterns
ACL 2025
StitchLLM: Serving LLMs, One Block at a Time
ACL 2025
Enhancing Talent Search Ranking with Role-Aware Expert Mixtures and LLM-based Fine-Grained Job Descriptions
EMNLP 2025
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
ACL 2025
HFT: Half Fine-Tuning for Large Language Models
ACL 2025
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding
ACL 2025
Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
ACL 2025
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
ACL 2025
Flexora: Flexible Low-Rank Adaptation for Large Language Models
ACL 2025
Continual Gradient Low-Rank Projection Fine-Tuning for LLMs
ACL 2025
LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering
ACL 2025
Forward Knows Efficient Backward Path: Saliency-Guided Memory-Efficient Fine-tuning of Large Language Models
ACL 2025
Faster Speculative Decoding via Effective Draft Decoder with Pruned Candidate Tree
ACL 2025
Large Language and Protein Assistant for Protein-Protein Interactions Prediction
ACL 2025
BeamLoRA: Beam-Constraint Low-Rank Adaptation
ACL 2025
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
ACL 2025
Gumbel Reranking: Differentiable End-to-End Reranker Optimization
ACL 2025