Model Compression

1674 directly classified papers

Papers

AAIG at GenAI Detection Task 1: Exploring Syntactically-Aware, Resource-Efficient Small Autoregressive Decoders for AI Content Detection COLING 2025

MSQ: Memory-Efficient Bit Sparsification Quantization ICCV 2025

FT-MDT: Extracting Decision Trees from Medical Texts via a Novel Low-rank Adaptation Method EMNLP 2025

TCFG: Truncated Classifier-Free Guidance for Efficient and Scalable Text-to-Image Acceleration ICCV 2025

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ICCV 2025

DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization ICCV 2025

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models ICCV 2025

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables ICCV 2025

Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions ICCV 2025

FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing NAACL 2025

The Impact of Inference Acceleration on Bias of LLMs NAACL 2025

Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding NAACL 2025

QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models NAACL 2025

LVPruning: An Effective yet Simple Language-Guided Vision Token Pruning Approach for Multi-modal Large Language Models NAACL 2025

MoLA: MoE LoRA with Layer-wise Expert Allocation NAACL 2025

Avoiding Copyright Infringement via Large Language Model Unlearning NAACL 2025

MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding NAACL 2025

RankAdaptor: Hierarchical Rank Allocation for Efficient Fine-Tuning Pruned LLMs via Performance Model NAACL 2025

As easy as PIE: understanding when pruning causes language models to disagree NAACL 2025

UNLEARN: Efficient Removal of Knowledge in Large Language Models NAACL 2025

Aligning Sizes of Intermediate Layers by LoRA Adapter for Knowledge Distillation NAACL 2025

Encoder-Aware Sequence-Level Knowledge Distillation for Low-Resource Neural Machine Translation NAACL 2025

Large Language Models Are Overparameterized Text Encoders NAACL 2025

Vocabulary-level Memory Efficiency for Language Model Fine-tuning NAACL 2025

Portcullis: A Scalable and Verifiable Privacy Gateway for Third-Party LLM Inference AAAI 2025