Efficient Computing

1253 directly classified papers

Papers

Practical Offloading for Fine-Tuning LLM on Commodity GPU via Learned Sparse Projectors AAAI 2025

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference AAAI 2025

QiMeng-GEMM: Automatically Generating High-Performance Matrix Multiplication Code by Exploiting Large Language Models AAAI 2025

Design Principle Transfer in Neural Architecture Search via Large Language Models AAAI 2025

BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement AAAI 2025

COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism AAAI 2025

Multi-Branch Self-Drafting for LLM Inference Acceleration AAAI 2025

Falcon: Faster and Parallel Inference of Large Language Models Through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree AAAI 2025

Building Vision Models upon Heat Conduction CVPR 2025

ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition AAAI 2025

3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding AAAI 2025

Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference AAAI 2025

Segment-Based Attention Masking for GPTs ACL 2025

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity ACL 2025

Error-driven Data-efficient Large Multimodal Model Tuning ACL 2025

CENTAUR: Bridging the Impossible Trinity of Privacy, Efficiency, and Performance in Privacy-Preserving Transformer Inference ACL 2025

VSSD: Vision Mamba with Non-Causal State Space Duality ICCV 2025

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference ACL 2025

StitchLLM: Serving LLMs, One Block at a Time ACL 2025

LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models ACL 2025

Breaking the Encoder Barrier for Seamless Video-Language Understanding ICCV 2025

KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding ACL 2025

Extending LLM Context Window with Adaptive Grouped Positional Encoding: A Training-Free Method ACL 2025

UniICL: An Efficient ICL Framework Unifying Compression, Selection, and Generation ACL 2025

DiffSkip: Differential Layer Skipping in Large Language Models ACL 2025