Authors: Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng.
This repo contains the official dataset and source code of the paper Referring Camouflaged Object Detection.
In this paper, we consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified
camouflaged objects based on a small set of referring images with salient target objects.
Fig. 1: Visual comparison between the standard COD and our Ref-COD.
Given an image containing multiple camouflaged objects, the COD
model tends to find all possible camouflaged objects that are blended
into the background without discrimination, while the Ref-COD model
attempts to identify the camouflaged objects under the condition of a set
of referring images.
For technical questions, feel free to contact zhangxuying1004@gmail.com and bowenyin@mail.nankai.edu.cn. For commercial licensing, please contact cmm@nankai.edu.cn. If you find our work helpful, please cite it (BibTeX) and star this project. Thank you!
Note that I will upload the remaining code later, including:
- The embedding process of the common representations of target objects;
- The attribute evaluation of different COD / Ref-COD methods;
- Visualization;
- Other tools.
In the meantime, if you are interested in the Ref-COD topic, you can use my processed representations from the dataset link below.
conda env create -f environment.yml
conda activate refcod
1. Dataset.
Fig. 2. Examples from our R2C7K dataset. Note that the camouflaged objects in Camo-subset are masked with their annotations in orange.
- Download our ensembled R2C7K dataset with access code 2013 on Baidu Netdisk.
├── R2C7K
    ├── Camo
        ├── train                # training set of camo-subset with 64 categories.
        └── test                 # testing set of camo-subset with 64 categories.
    ├── Ref
        ├── Images               # all images of ref-subset with 64 categories.
        ├── RefFeat_ICON-R       # all object representations of ref-subset with 64 categories.
        └── Saliency_ICON-R      # all foreground maps of ref-subset with 64 categories.
- Update the 'data_root' param with your R2C7K location in train.py, infer.py and test.py.
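Before editing 'data_root', it can help to confirm that the downloaded folder matches the directory tree above. The following is a minimal sketch, not part of this repo: the helper name `missing_r2c7k_dirs` and the default path `./R2C7K` are assumptions chosen for illustration, and only the sub-directory names come from the tree.

```python
from pathlib import Path

# Sub-directories taken from the R2C7K tree above, relative to the dataset root.
EXPECTED_SUBDIRS = (
    "Camo/train",
    "Camo/test",
    "Ref/Images",
    "Ref/RefFeat_ICON-R",
    "Ref/Saliency_ICON-R",
)


def missing_r2c7k_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root.

    Hypothetical helper: it only checks that the folders exist, not their contents.
    """
    root = Path(data_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    import sys

    # "./R2C7K" is an assumed default; pass your own location as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "./R2C7K"
    missing = missing_r2c7k_dirs(root)
    if missing:
        print(f"Missing under {root}: {', '.join(missing)}")
    else:
        print(f"{root} matches the expected R2C7K layout.")
```

If the check passes, the same path is the value to use for 'data_root'.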
2. Framework
Fig. 3. Overall architecture of our R2CNet framework, which is composed of two branches, i.e., reference branch in green and segmentation branch
in orange. In the reference branch, the common representation of a specified object from images is obtained by masking and pooling the visual
features with the foreground map generated by a SOD network. In the segmentation branch, the visual features from the last three layers of the
encoder are employed to represent the given image. Then, these two kinds of feature representations are fused and compared in the well-designed
RMG module to generate a mask prior, which is used to enrich the visual feature among different scales to highlight the camouflaged targets in our
RFE module. Finally, the enriched features are fed into the decoder to generate the final segmentation map. DSF: Dual-source Information Fusion, MSF: Multi-scale Feature Fusion, TM: Target Matching.
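The "masking and pooling" step of the reference branch can be illustrated with masked average pooling. This is a rough NumPy sketch under stated assumptions, not the released implementation: the function names are hypothetical, the foreground map is assumed to be already resized to the feature resolution, and averaging the per-image vectors into one common representation is an assumption about how several referring images are combined.

```python
import numpy as np


def masked_average_pool(features, foreground, eps=1e-6):
    """Pool a (C, H, W) feature map into a (C,) vector, weighted by a (H, W) foreground map.

    Assumes the foreground map has values in [0, 1] and matches the feature resolution.
    """
    features = np.asarray(features, dtype=np.float64)
    foreground = np.asarray(foreground, dtype=np.float64)
    # Suppress background positions, then average over the foreground area only.
    weighted = features * foreground[None, :, :]
    return weighted.sum(axis=(1, 2)) / (foreground.sum() + eps)


def common_representation(feature_maps, foreground_maps):
    """Average the pooled vectors of several referring images (an assumed aggregation)."""
    vectors = [
        masked_average_pool(f, m) for f, m in zip(feature_maps, foreground_maps)
    ]
    return np.mean(vectors, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = [rng.normal(size=(4, 8, 8)) for _ in range(3)]  # toy encoder features
    masks = [rng.uniform(size=(8, 8)) for _ in range(3)]    # toy SOD foreground maps
    print(common_representation(feats, masks).shape)
```

In this sketch the result is a single C-dimensional vector per target category, which is the kind of object representation the segmentation branch would then compare against its multi-scale visual features.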
3. Infer.
- Download the pre-trained r2cnet.pth checkpoint with access code 2023 on Baidu Netdisk.
- Put the checkpoint file in './snapshot/saved_models/'.
- Run python infer.py to generate the foreground maps of R2CNet.
- Alternatively, download our predictions (R2CNet-Maps) directly with access code 2023 on Baidu Netdisk.
4. Test.
- Make sure that the pre-trained r2cnet.pth checkpoint file has been placed in './snapshot/saved_models/'.
- Run python test.py to evaluate the performance of R2CNet.
5. Ref-COD Benchmark Results.
Tab. 1. Comparison of the COD models with their Ref-COD counterparts. All models are evaluated on an NVIDIA RTX 3090 GPU. ‘R-50’: ResNet-50 [82], ‘E-B4’: EfficientNet-B4 [86], ‘R2-50’: Res2Net-50 [87], ‘R3-50’: Triple ResNet-50 [2]. ‘-Ref’: the model with image references composed of salient objects. ‘Attribute’: the attribute of each network, ‘Single-obj’: the scene of a single camouflaged object, ‘Multi-obj’: the scene of multiple camouflaged objects, ‘Overall’: all scenes containing camouflaged objects.
This repo is built mainly on SINet-V2, PFENet and MethodsCmp. Thanks for their great work!