Enhancing Prompt Generation with Adaptive Refinement for Camouflaged Object Detection

Xuehan Chen; Guangyu Ren; Tianhong Dai; Tania Stathaki; Hengyan Liu

ICCV 2025

Abstract

Foundation models, such as the Segment Anything Model (SAM), have exhibited remarkable performance in conventional segmentation tasks, primarily due to their training on large-scale datasets. Nonetheless, challenges remain in specific downstream tasks, such as Camouflaged Object Detection (COD). Existing research primarily aims to enhance performance by integrating additional multimodal information derived from other foundation models. However, directly leveraging the information generated by these models may introduce additional biases due to domain shifts. To address this issue, we propose an Adaptive Refinement Module (ARM), which efficiently processes multimodal information while simultaneously refining the mask prompt. Furthermore, we construct an auxiliary embedding that exploits the intermediate information generated within ARM, providing SAM with richer feature representations. Experimental results indicate that our proposed architecture surpasses most state-of-the-art (SOTA) models in the COD task, particularly excelling in structured target segmentation.

Topics

Artificial Intelligence > Core AI > Foundation Models
Computer Vision > Analysis > Object Detection
Computer Vision > Processing > Semantic Segmentation
Computer Vision > Analysis > Object Segmentation
Deep Learning > Models > Foundation Models

Keywords

semantic segmentation, prompt generation, segment anything model, camouflaged object detection, adaptive refinement, mask prompt, multimodal information

Related papers

MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval 2025

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality 2025

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval 2025

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching 2025

Robust Dataset Condensation using Supervised Contrastive Learning 2025