Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
git clone https://github.com/XavierJiezou/Cloud-Adapter.git
cd Cloud-Adapter
You can either set up the environment manually or use our pre-configured environment for convenience:
Ensure you are using Python 3.8 or higher, then install the required dependencies:
pip install -r requirements.txt
We provide a pre-configured environment (envs) hosted on Hugging Face. Follow the instructions on its page to download, set up, and activate the environment.
We have open-sourced all datasets used in the paper, which are hosted on Hugging Face Datasets. Please follow the instructions on the dataset page to download the data.
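For illustration, the data can be fetched with the huggingface_hub client; the dataset repository ID below is a placeholder, so substitute the one listed on the dataset page:

```python
# Hypothetical sketch: download the datasets into ./data with huggingface_hub.
# The repo_id below is a placeholder; use the ID given on the dataset page.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="XavierJiezou/cloud-adapter-datasets",  # placeholder dataset ID
    repo_type="dataset",
    local_dir="data",
)
```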
After downloading, organize the dataset as follows:
Cloud-Adapter
├── ...
├── data
│ ├── cloudsen12_high_l1c
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ │ ├── img_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ ├── cloudsen12_high_l2a
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ │ ├── img_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ ├── gf12ms_whu_gf1
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ │ ├── img_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ ├── gf12ms_whu_gf2
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ │ ├── img_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ ├── hrc_whu
│ │ ├── ann_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
│ │ ├── img_dir
│ │ │ ├── train
│ │ │ ├── val
│ │ │ ├── test
├── ...
All model weights used in the paper have been open-sourced and are available on Hugging Face Models. You can download the pretrained models and directly integrate them into your pipeline.
To use a pretrained model, specify the path to the downloaded weights in your configuration file or command-line arguments.
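For example, in an MMSegmentation-style config this typically amounts to setting load_from; the checkpoint path below is a placeholder for wherever you saved the weights:

```python
# Sketch: load downloaded pretrained weights in an MMSegmentation-style config.
# The checkpoint path is a placeholder; point it at your downloaded file.
load_from = "checkpoints/cloud_adapter_pmaa_convnext_lora_16_adapter_all.pth"
```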
We use the MMSegmentation framework for training. Please ensure MMSegmentation is installed and your configuration file is properly set up.
Update the configs directory with your training configuration, or use one of the provided example configurations. You can customize the backbone, dataset paths, and hyperparameters in the configuration file (e.g., configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py). A rough sketch of such a customization is shown below.
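As a sketch, a derived config could inherit the provided example and override the dataset root and a few hyperparameters; the key names below follow common MMSegmentation conventions and may differ from the exact structure of the provided configs:

```python
# Sketch of a derived config; key names follow typical MMSegmentation
# conventions and are not taken from the repository's actual configs.
_base_ = ["./cloud_adapter_pmaa_convnext_lora_16_adapter_all.py"]

# Point the dataloaders at your own dataset copy and adjust the batch size.
train_dataloader = dict(dataset=dict(data_root="data/hrc_whu"), batch_size=4)
val_dataloader = dict(dataset=dict(data_root="data/hrc_whu"))

# Adjust the learning rate.
optim_wrapper = dict(optimizer=dict(lr=6e-5))
```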
Use the following command to begin training:
CUDA_VISIBLE_DEVICES=0 python tools/train.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py
To resume training from a checkpoint or fine-tune using pretrained weights, run:
python tools/train.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py --resume-from path/to/checkpoint.pth
Use the following command to evaluate the trained model:
CUDA_VISIBLE_DEVICES=0 python tools/test.py configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py path/to/checkpoint.pth
If you want to evaluate the model’s performance on different scenes of the L8_Biome dataset, you can run the following script:
python tools/eval_l8_scene.py --config configs/to/path.py --checkpoint path/to/checkpoint.pth --img_dir data/l8_biome
This will automatically evaluate the model across various scenes of the L8_Biome dataset, providing detailed performance metrics for each scene.
If you would like to reproduce the other models and comparisons presented in the paper, please refer to our other repository: CloudSeg. This repository contains the implementation and weights of the other models used for comparison in the study.
We have published the pretrained models' visualization results on various datasets to Hugging Face. If you prefer not to run the code, you can download the visualization results directly from that repository.
We have created a Gradio demo to showcase the model's functionality. If you'd like to try it out, follow these steps:
- Navigate to the hugging_face directory:
cd hugging_face
- Run the demo:
python app.py
This will start the Gradio interface, where you can upload remote sensing images and visualize the model's segmentation results in real-time.
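For reference, a pared-down demo along these lines might look like the sketch below; it assumes MMSegmentation 1.x inference APIs and placeholder config/checkpoint paths, and the actual hugging_face/app.py remains the authoritative version:

```python
# Minimal Gradio sketch, not the repository's actual app.py.
# Assumes MMSegmentation 1.x inference APIs; paths are placeholders.
import gradio as gr
from mmseg.apis import inference_model, init_model, show_result_pyplot

CONFIG = "configs/adapter/cloud_adapter_pmaa_convnext_lora_16_adapter_all.py"
CHECKPOINT = "path/to/checkpoint.pth"  # download from Hugging Face Models first

# Use device="cpu" if no GPU is available.
model = init_model(CONFIG, CHECKPOINT, device="cuda:0")

def segment(image):
    """Run cloud segmentation on an uploaded image and return the overlay."""
    result = inference_model(model, image)
    return show_result_pyplot(model, image, result, show=False)

demo = gr.Interface(fn=segment, inputs=gr.Image(type="numpy"), outputs=gr.Image())
demo.launch()
```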
- If you encounter a file not found error, it is likely that the model weights have not been downloaded. Please visit Hugging Face Models to download the pretrained model weights.
- GPU Requirements: To run the model on a GPU, you will need at least 16 GB of GPU memory.
- Running on CPU: If you prefer to run the demo on CPU instead of GPU, set the following environment variable before running the demo:
export CUDA_VISIBLE_DEVICES=-1
If you use our code or models in your research, please cite our paper:
@misc{cloud-adapter,
title={Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images},
author={Xuechao Zou and Shun Zhang and Kai Li and Shiying Wang and Junliang Xing and Lei Jin and Congyan Lang and Pin Tao},
year={2024},
eprint={2411.13127},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.13127},
}