
T-IP-2023: Rethinking Cross-Domain Pedestrian Detection: A Background-Focused Distribution Alignment Framework for Instance-Free One-Stage Detectors

Abstract

Cross-domain pedestrian detection aims to generalize pedestrian detectors from one label-rich domain to another label-scarce domain, which is crucial for various real-world applications. Most recent works focus on domain alignment to train domain-adaptive detectors either at the instance level or image level. From a practical point of view, one-stage detectors are faster. Therefore, we concentrate on designing a cross-domain algorithm for rapid one-stage detectors that lack instance-level proposals and can only perform image-level feature alignment. However, pure image-level feature alignment causes foreground-background misalignment, i.e., the foreground features in the source domain image are falsely aligned with background features in the target domain image. To address this issue, we systematically analyze the importance of foreground and background in image-level cross-domain alignment, and learn that background plays a more critical role in image-level cross-domain alignment. Therefore, we focus on cross-domain background feature alignment while minimizing the influence of foreground features on the cross-domain alignment stage. This paper proposes a novel framework, namely, background-focused distribution alignment (BFDA), to train domain-adaptive one-stage pedestrian detectors. Specifically, BFDA first decouples the background features from the whole image feature maps and then aligns them via a novel long-short-range discriminator. Extensive experiments demonstrate that compared to mainstream domain adaptation technologies, BFDA significantly enhances cross-domain pedestrian detection performance for either one-stage or two-stage detectors. Moreover, by employing the efficient one-stage detector (YOLOv5), BFDA can reach 217.4 FPS (640×480 pixels) on NVIDIA Tesla V100 (7∼12 times the FPS of the existing frameworks), which is highly significant for practical applications.

Datasets

The primary datasets used in this paper are Cityscapes, Caltech, and Foggycityscapes. Below, we present the Cityscapes and Caltech datasets used in this study:

For Foggycityscapes, we recommend using the same file structure and label scheme as Cityscapes. Specifically, please organize it in the following format:

Foggycityscapes
  - images
    - train_02
    - train_01
    - train_005
    - val_02
    - val_01
    - val_005
  - labels
    - train_02
    - train_01
    - train_005
    - val_02
    - val_01
    - val_005
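
The layout above can be checked before training with a short script. This is a minimal sketch, not part of the BFDA repository: the helper names (`expected_dirs`, `missing_dirs`, `create_layout`) and the dataset root path are hypothetical, and only the folder names come from the structure listed above.

```python
from pathlib import Path

# Folder names taken from the Foggycityscapes layout described above
TOP_LEVEL = ["images", "labels"]
SPLITS = ["train_02", "train_01", "train_005", "val_02", "val_01", "val_005"]


def expected_dirs(root):
    """List every directory the layout expects under the dataset root."""
    root = Path(root)
    return [root / top / split for top in TOP_LEVEL for split in SPLITS]


def missing_dirs(root):
    """Return the expected directories that do not exist yet."""
    return [d for d in expected_dirs(root) if not d.is_dir()]


def create_layout(root):
    """Create the empty directory skeleton (existing folders are left alone)."""
    for d in expected_dirs(root):
        d.mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    root = Path("Foggycityscapes")  # hypothetical location; adjust to your setup
    for d in missing_dirs(root):
        print("missing:", d)
```

Running the script only reports missing folders; call `create_layout` explicitly if you want the skeleton created for you.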

Usage

Conda environment

Clone the repo and install requirements.txt in a Python>=3.8.0 environment with PyTorch>=1.7.

git clone https://github.com/caiyancheng/BFDA.git  # clone
cd BFDA
pip install -r requirements.txt  # install

Download the YOLOv5 pre-trained models

Because the original YOLOv5 repo is updated continuously, the pre-trained weights used by the BFDA framework are provided separately and can be downloaded here: YOLOv5 pre-trained models. Please place the downloaded weight file in the BFDA root directory.

Set data path

After cloning this repository and downloading the pre-trained weights, please create a '[your_data_set].yaml' file in the './data' directory, following the format of the other YAML files in that directory.
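
As a starting point, the sketch below generates a dataset YAML using the keys of the upstream YOLOv5 convention (`train`, `val`, `nc`, `names`). This is an assumption: BFDA's own YAML files may use additional or different keys (for example, for the target domain), so treat the existing files in './data' as authoritative. The helper name `dataset_yaml`, the paths, and the class name are hypothetical.

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Render a YOLOv5-style dataset description as YAML text."""
    names = ", ".join(f"'{name}'" for name in class_names)
    lines = [
        f"train: {train_dir}",       # folder with training images
        f"val: {val_dir}",           # folder with validation images
        f"nc: {len(class_names)}",   # number of classes
        f"names: [{names}]",         # class names, in label-index order
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths and class name; replace them with your own
    text = dataset_yaml(
        "../Foggycityscapes/images/train_02",
        "../Foggycityscapes/images/val_02",
        ["pedestrian"],
    )
    print(text)  # review the text, then save it as ./data/[your_data_set].yaml
```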

Set hyperparameters

Each Python script defines its hyperparameters through "parser.add_argument" calls. Before running a script, find these calls in that file and set the values you need. The hyperparameters in hyp.scratch.yaml can also be modified.
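
For readers unfamiliar with `argparse`, the sketch below shows how such a default works and how it can be overridden without editing the file. The flags and default values here are hypothetical, modeled on the upstream YOLOv5 training script; the BFDA scripts may define different ones, so check each file.

```python
import argparse


def build_parser():
    """Build an illustrative parser; the flags below are assumptions."""
    parser = argparse.ArgumentParser(description="illustrative training options")
    parser.add_argument("--weights", type=str, default="yolov5s.pt",
                        help="initial weights path (placeholder default)")
    parser.add_argument("--data", type=str, default="data/your_data_set.yaml",
                        help="dataset YAML path (placeholder default)")
    parser.add_argument("--epochs", type=int, default=300,
                        help="number of training epochs")
    parser.add_argument("--batch-size", type=int, default=16,
                        help="total batch size")
    return parser


if __name__ == "__main__":
    # Passing a list here mimics typing the flags on the command line
    opt = build_parser().parse_args(["--epochs", "50", "--batch-size", "8"])
    print(vars(opt))
```

Editing the `default=` value changes the setting permanently, while passing the flag on the command line overrides it for a single run.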

Train source domain weights (source)

Taking Cityscapes -> Caltech as an example, start by training YOLOv5 detection weights on the source domain, Cityscapes.

python train_city_tip.py # -- hyperparameters

Cross-domain training

Load the best weights from the previous step and run cross-domain detection training.

python train_UDA_city2caltech_BFDA_Full.py # -- New hyperparameters, worth trying more

  • Note that BFDA's adversarial learning strategy is sensitive to hyperparameters, so we recommend running training several times with the same set of hyperparameters.
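
When repeating runs, it can help to summarise the final metric across them instead of reading off a single run. The sketch below is one way to do that, assuming you collect one score per run by hand; the function name `summarise_runs` is hypothetical and the numbers in the example are placeholders, not results from the paper.

```python
from statistics import mean, stdev


def summarise_runs(scores):
    """Return (mean, sample standard deviation, min, max) of per-run scores."""
    if not scores:
        raise ValueError("need at least one run")
    # The sample standard deviation is undefined for a single run
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread, min(scores), max(scores)


if __name__ == "__main__":
    # Placeholder values for illustration only
    example_scores = [0.50, 0.54, 0.52]
    avg, spread, low, high = summarise_runs(example_scores)
    print(f"mean={avg:.3f} std={spread:.3f} min={low:.3f} max={high:.3f}")
```

Whether the minimum or the maximum is the best run depends on the metric (for example, lower is better for a miss rate).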

Citation

If you find this work helpful in your research, please cite:

@article{cai2023rethinking,
  title={Rethinking cross-domain pedestrian detection: a background-focused distribution alignment framework for instance-free one-stage detectors},
  author={Cai, Yancheng and Zhang, Bo and Li, Baopu and Chen, Tao and Yan, Hongliang and Zhang, Jingdong},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grant 62071127 and Grant U1909207, in part by the Shanghai Natural Science Foundation under Grant 23ZR1402900, and in part by the Zhejiang Laboratory under Project 2021KH0AB05.

We also gratefully acknowledge the authors of YOLOv5 for their open-source code. Visit the following links to see more of their contributions.