Model Compression

1928 directly classified papers

Papers

Democratizing LLM Efficiency: From Hyperscale Optimizations to Universal Deployability AAAI 2026

Efficient Model Specialization via Training-time and Test-time Adaptation AAAI 2026

CO2-Meter: A Comprehensive Carbon Footprint Estimator for LLMs on Edge Devices AAAI 2026

iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification AAAI 2026

Language Model Distillation: A Temporal Difference Imitation Learning Perspective AAAI 2026

SCOPE: Intrinsic Semantic Space Control for Mitigating Copyright Infringement in LLMs AAAI 2026

Making Every Head Count: Sparse Attention Without the Speed-Performance Trade-off AAAI 2026

Outlier Matters: Efficient Long-to-Short Reasoning via Outlier-Guided Model Merging AAAI 2026

ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs AAAI 2026

PurMM: Attention-Guided Test-Time Backdoor Purification in Multimodal Large Language Models AAAI 2026

Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction AAAI 2026

CP-Router: An Uncertainty-Aware Router Between LLM and LRM AAAI 2026

KeepKV: Achieving Periodic Lossless KV Cache Compression for Efficient LLM Inference AAAI 2026

MCW-KD: Multi-Cost Wasserstein Knowledge Distillation for Large Language Models AAAI 2026

Re-SpS: A Reinforcement Learning Approach to Speculative Sampling AAAI 2026

LoKI: Low-Damage Knowledge Implanting of Large Language Models AAAI 2026

LLM-Oriented Token-Adaptive Knowledge Distillation AAAI 2026

MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection Under Cloaking Perturbations AAAI 2026

V-Pruner: A Fast and Globally-informed Token Pruning Framework for Vision Transformer AAAI 2026

KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache AAAI 2026

AdaFuse: Accelerating Dynamic Adapter Inference via Token-Level Pre-Gating and Fused Kernel Optimization AAAI 2026

SEAP: Sparse Expert Activation Pruning Unlocks the Brainpower of Large Language Models AAAI 2026

SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning AAAI 2026

GateRA: Token-aware Modulation for Parameter-Efficient Fine-tuning AAAI 2026

SOM Directions Are Better than One: Multi-Directional Refusal Suppression in Language Models AAAI 2026