Model Compression

1503 directly classified papers

Papers

MLWQ: Efficient Small Language Model Deployment via Multi-Level Weight Quantization EMNLP 2025

Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes ICCV 2025

Best Practices for Distilling Large Language Models into BERT for Web Search Ranking COLING 2025

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation ICCV 2025

A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality EMNLP 2025

Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control ICCV 2025

Lightweight Safety Guardrails Using Fine-tuned BERT Embeddings COLING 2025

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices ICCV 2025

ToDi: Token-wise Distillation via Fine-Grained Divergence Control EMNLP 2025

MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective ICCV 2025

Efficient Vocabulary Reduction for Small Language Models COLING 2025

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition ICCV 2025

Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models EMNLP 2025

Variance-Based Pruning for Accelerating and Compressing Trained Networks ICCV 2025

DadmaTools V2: an Adapter-Based Natural Language Processing Toolkit for the Persian Language COLING 2025

OuroMamba: A Data-Free Quantization Framework for Vision Mamba ICCV 2025

Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models ACL 2025

Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers ICCV 2025

AAIG at GenAI Detection Task 1: Exploring Syntactically-Aware, Resource-Efficient Small Autoregressive Decoders for AI Content Detection COLING 2025

Make Your Training Flexible: Towards Deployment-Efficient Video Models ICCV 2025

LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering ACL 2025

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ICCV 2025

Extracting General-use Transformers for Low-resource Languages via Knowledge Distillation COLING 2025

From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning ICCV 2025

Pretraining Context Compressor for Large Language Models with Embedding-Based Memory ACL 2025