Model Compression

1928 directly classified papers

Papers

PipeThreader: Software-Defined Pipelining for Efficient DNN Execution OSDI 2025

MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts ACL 2025

DenseLoRA: Dense Low-Rank Adaptation of Large Language Models ACL 2025

KOEnsAttack: Towards Efficient Data-Free Black-Box Adversarial Attacks via Knowledge-Orthogonalized Substitute Ensembles ICCV 2025

TRNAS: A Training-Free Robust Neural Architecture Search ICCV 2025

ELMGS: Enhancing Memory and Computation Scalability through Compression for 3D Gaussian Splatting WACV 2025

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ACL 2025

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs ACL 2025

Knowledge Distillation with Refined Logits ICCV 2025

P2 Law: Scaling Law for Post-Training After Model Pruning ACL 2025

GradOT: Training-free Gradient-preserving Offsite-tuning for Large Language Models ACL 2025

Feature Coding in the Era of Large Models: Dataset, Test Conditions, and Benchmark ICCV 2025

SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting ICCV 2025

PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models ACL 2025

Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding ACL 2025

Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations ICCV 2025

LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models ACL 2025

TASO: Task-Aligned Sparse Optimization for Parameter-Efficient Model Adaptation EMNLP 2025

A Good Teacher Adapts Their Knowledge for Distillation ICCV 2025

Task-Specific Zero-shot Quantization-Aware Training for Object Detection ICCV 2025

ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba ICCV 2025

Knockoff Branch: Model Stealing Attack via Adding Neurons in the Pre-Trained Model WACV 2025

QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead AAAI 2025

Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias AAAI 2025

Learning to Rewind via Iterative Prediction of Past Weights for Practical Unlearning AAAI 2025