Model Compression

1503 directly classified papers

Papers

SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning EMNLP 2025

LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts EMNLP 2025

Does quantization affect models’ performance on long-context tasks? EMNLP 2025

DisLoRA: Task-specific Low-Rank Adaptation via Orthogonal Basis from Singular Value Decomposition EMNLP 2025

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models CVPR 2025

EA-ViT: Efficient Adaptation for Elastic Vision Transformer ICCV 2025

LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion ICCV 2025

Binarized Neural Network for Multi-spectral Image Fusion CVPR 2025

MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning CVPR 2025

Treasures in Discarded Weights for LLM Quantization AAAI 2025

One-Shot Knowledge Transfer for Scalable Person Re-Identification ICCV 2025

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models ICCV 2025

Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration ICCV 2025

BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation COLING 2025

MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes ICCV 2025

DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization ACL 2025

MSQ: Memory-Efficient Bit Sparsification Quantization ICCV 2025

Resource-Efficient Anonymization of Textual Data via Knowledge Distillation from Large Language Models COLING 2025

Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning ICCV 2025

Pretraining Context Compressor for Large Language Models with Embedding-Based Memory ACL 2025

Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels ICCV 2025

FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference EMNLP 2025

StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data ICCV 2025

RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Linguistic Classifiers COLING 2025

ZipVL: Accelerating Vision-Language Models through Dynamic Token Sparsity ICCV 2025