Research Explorer
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
ICCV 2025
FocusLLM: Precise Understanding of Long Context by Dynamic Condensing
ACL 2025
Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models
ICCV 2025
Towards compact and efficient Slovak summarization models
ACL 2025
Emulating Self-attention with Convolution for Efficient Image Super-Resolution
ICCV 2025
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
ICCV 2025
Text Embedding Knows How to Quantize Text-Guided Diffusion Models
ICCV 2025
LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching
ICCV 2025
A Good Teacher Adapts Their Knowledge for Distillation
ICCV 2025
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
ACL 2025
Improving Continual Pre-training Through Seamless Data Packing
ACL 2025
MiniKV: Pushing the Limits of 2-Bit KV Cache via Compression and System Co-Design for Efficient Long Context Inference
ACL 2025
Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models
ICCV 2025
Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation
ICCV 2025
FREE: Fast and Robust Vision Language Models with Early Exits
ACL 2025
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning
ICCV 2025
VFM-Adapter: Adapting Visual Foundation Models for Dense Prediction with Dynamic Hybrid Operation Mapping
AAAI 2025
Accelerating Diffusion Transformer via Gradient-Optimized Cache
ICCV 2025
TCFG: Truncated Classifier-Free Guidance for Efficient and Scalable Text-to-Image Acceleration
ICCV 2025
BitNet: 1-bit Pre-training for Large Language Models
JMLR 2025
Recall with Reasoning: Chain-of-Thought Distillation for Mamba’s Long-Context Memory and Extrapolation
EMNLP 2025
Slender-Mamba: Fully Quantized Mamba in 1.58 Bits From Head to Toe
COLING 2025
Improving Reasoning Capabilities in Small Models through Mixture-of-layers Distillation with Stepwise Attention on Key Information
EMNLP 2025
Lightweight Safety Guardrails Using Fine-tuned BERT Embeddings
COLING 2025
Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework
EMNLP 2025