Model Compression

1503 directly classified papers

Papers

RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates ACL 2025

AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting ACL 2025

DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization ACL 2025

ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations ACL 2025

StitchLLM: Serving LLMs, One Block at a Time ACL 2025

Scaling Laws and Efficient Inference for Ternary Language Models ACL 2025

Automated Fine-Grained Mixture-of-Experts Quantization ACL 2025

From Teacher to Student: Tracking Memorization Through Model Distillation ACL 2025

Pretraining Context Compressor for Large Language Models with Embedding-Based Memory ACL 2025

MT2ST: Adaptive Multi-Task to Single-Task Learning ACL 2025

Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models EMNLP 2025

GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression EMNLP 2025

Studying the Role of Input-Neighbor Overlap in Retrieval-Augmented Language Models Training Efficiency EMNLP 2025

COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection EMNLP 2025

CLMTracing: Black-box User-level Watermarking for Code Language Model Tracing EMNLP 2025

Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis ACL 2025

ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs ACL 2025

Speed Without Sacrifice: Fine-Tuning Language Models with Medusa and Knowledge Distillation in Travel Applications ACL 2025

LSSF: Safety Alignment for Large Language Models through Low-Rank Safety Subspace Fusion ACL 2025

HD-PiSSA: High-Rank Distributed Orthogonal Adaptation EMNLP 2025

LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering ACL 2025

General Compression Framework for Efficient Transformer Object Tracking ICCV 2025

AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models EMNLP 2025

AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model ICCV 2025

ZipVL: Accelerating Vision-Language Models through Dynamic Token Sparsity ICCV 2025