Efficient Computing

596 directly classified papers

Papers

METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding EMNLP 2025

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework EMNLP 2025

Answer Convergence as a Signal for Early Stopping in Reasoning EMNLP 2025

NeuroAda: Activating Each Neuron’s Potential for Parameter-Efficient Fine-Tuning EMNLP 2025

VisiPruner: Decoding Discontinuous Cross-Modal Dynamics for Efficient Multimodal LLMs EMNLP 2025

GraphKV: Breaking the Static Selection Paradigm with Graph-Based KV Cache Eviction EMNLP 2025

Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights CVPR 2025

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models CVPR 2025

Leveraging Asynchronous Spiking Neural Networks for Ultra Efficient Event-Based Visual Processing AAAI 2025

DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts ACL 2025

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention ACL 2025

Beyond Logits: Aligning Feature Dynamics for Effective Knowledge Distillation ACL 2025

RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations IJCAI 2025

COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection EMNLP 2025

RefreshKV: Updating Small KV Cache During Long-form Generation ACL 2025

500xCompressor: Generalized Prompt Compression for Large Language Models ACL 2025

Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking ACL 2025

SpindleKV: A Novel KV Cache Reduction Method Balancing Both Shallow and Deep Layers ACL 2025

Retrofitting Large Language Models with Dynamic Tokenization ACL 2025

LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering ACL 2025

QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines ACL 2025

Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs ACL 2025

EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation ACL 2025

Smarter, Not Harder: Training-Free Adaptive Computation for Transformers ACL 2025

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers CVPR 2025