Efficient Computing

596 directly classified papers

Papers

Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff ACL 2025

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework EMNLP 2025

Answer Convergence as a Signal for Early Stopping in Reasoning EMNLP 2025

RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations IJCAI 2025

Seeing More with Less: Human-like Representations in Vision Models CVPR 2025

COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection EMNLP 2025

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models CVPR 2025

Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models EMNLP 2025

Perception Compressor: A Training-Free Prompt Compression Framework in Long Context Scenarios NAACL 2025

GraphKV: Breaking the Static Selection Paradigm with Graph-Based KV Cache Eviction EMNLP 2025

Position Really Matters: Towards a Holistic Approach for Prompt Tuning NAACL 2025

Reasoning under Uncertainty: Efficient LLM Inference via Unsupervised Confidence Dilution and Convergent Adaptive Sampling EMNLP 2025

MEET: Towards Memory-Efficient Temporal Sparse Deep Neural Networks CVPR 2025

LeanK: Learnable K Cache Channel Pruning for Efficient Decoding EMNLP 2025

Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation WACV 2025

QSpec: Speculative Decoding with Complementary Quantization Schemes EMNLP 2025

Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification EMNLP 2025

SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning EMNLP 2025

LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts EMNLP 2025

EcoTune: Token-Efficient Multi-Fidelity Hyperparameter Optimization for Large Language Model Inference EMNLP 2025

Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study EMNLP 2025

AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity EMNLP 2025

DINT Transformer EMNLP 2025

Stop Looking for “Important Tokens” in Multimodal Language Models: Duplication Matters More EMNLP 2025

EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality CVPR 2025