AAAI 2026

DeFT-LoRA: Decoupled and Fused Tuning with LoRA Experts for Universal Cross-Domain Retrieval

Abstract

Universal Cross-Domain Retrieval (UCDR) aims to retrieve images across unseen domains and categories, a critical capability for real-world applications. While large-scale Vision-Language Models (VLMs) such as CLIP offer strong zero-shot category generalization, they struggle with domain shifts. Existing methods often improve domain robustness at the cost of high computational overhead or by compromising the VLM's inherent knowledge. To address this, we propose Decoupled and Fused Tuning with LoRA (DeFT-LoRA), a novel, parameter-efficient framework that integrates Low-Rank Adaptation (LoRA) with a Mixture-of-Experts (MoE) mechanism. This approach resolves the intrinsic conflict that arises when a single adapter must encode both domain-invariant and domain-specific knowledge, enabling our model to construct a domain adapter for each input image. We propose a three-stage training strategy that first learns a shared Base LoRA for domain-invariant features, then derives Domain-Specific Experts to capture domain-specific styles, and finally fuses them dynamically with a lightweight gating network. Extensive experiments on three UCDR benchmarks demonstrate that DeFT-LoRA achieves comparable or superior performance to state-of-the-art methods while requiring only 1.46% of CLIP's image-encoder parameters and reducing computational overhead, thereby establishing an exceptional balance between accuracy and efficiency.
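To make the fusion idea concrete, the following is a minimal sketch of how a frozen projection, a shared base LoRA, and several gate-weighted expert LoRAs could be combined for a single input. The abstract does not give the fusion rule, so everything here is an assumption for illustration: the additive combination, the softmax gate computed from the input feature, and all names (`fused_forward`, `lora_delta`, `G`, the toy dimensions) are hypothetical and are not the authors' implementation. The three-stage training procedure is not shown.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(t - m) for t in z]
    s = sum(e)
    return [t / s for t in e]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def lora_delta(A, B, x):
    """Low-rank update B @ (A @ x); A is (rank x d_in), B is (d_out x rank)."""
    return matvec(B, matvec(A, x))

def fused_forward(x, W, base, experts, G):
    """Assumed fusion: frozen W + shared base LoRA + gate-weighted expert LoRAs."""
    gate = softmax(matvec(G, x))      # one weight per expert, sums to 1
    y = matvec(W, x)                  # frozen pretrained projection
    y = [a + b for a, b in zip(y, lora_delta(base[0], base[1], x))]
    for g, (A, B) in zip(gate, experts):
        y = [a + g * b for a, b in zip(y, lora_delta(A, B, x))]
    return y, gate

# Toy sizes chosen only for illustration (hypothetical, not from the paper).
rng = random.Random(0)
d_in, d_out, rank, n_experts = 8, 8, 2, 3
W = rand_matrix(d_out, d_in, rng)
base = (rand_matrix(rank, d_in, rng), rand_matrix(d_out, rank, rng))
experts = [(rand_matrix(rank, d_in, rng), rand_matrix(d_out, rank, rng))
           for _ in range(n_experts)]
G = rand_matrix(n_experts, d_in, rng)  # linear gate, an assumed design
x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]

y, gate = fused_forward(x, W, base, experts, G)
print("gate weights:", [round(g, 3) for g in gate])
```

Because the gate depends on `x`, the effective low-rank update differs per input, which is one plausible reading of "a domain adapter for each input image". Each adapter stores `rank * (d_in + d_out)` values rather than `d_in * d_out`, which is where the parameter savings of LoRA-style tuning come from.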
