ECCV 2024 | Poster and Presentation | Proceedings
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise, especially in crowdsourcing scenarios. Most prior object detection methods assume accurate annotations; a few recent works have studied object detection with noisy crowdsourced annotations, but they evaluate on distinct synthetic crowdsourced datasets of varying setups under artificial assumptions. To address these algorithmic limitations and the evaluation inconsistency, we first propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations, with the unique ability to automatically infer annotators' label qualities. Unlike previous approaches, BDC is model-agnostic, requires no prior knowledge of the annotators' skill levels, and seamlessly integrates with existing object detection models. Due to the scarcity of real-world crowdsourced datasets, we introduce large synthetic datasets by simulating varying crowdsourcing scenarios, allowing consistent evaluation of different models at scale. Extensive experiments on both real and synthetic crowdsourced datasets show that BDC outperforms existing state-of-the-art methods, demonstrating its superiority in leveraging crowdsourced data for object detection.
Figure 1: Examples of ambiguous cases with noisy or incorrect annotations on (a-c) MS COCO, (d, e) VinDr-CXR and (f) a disaster response dataset.
Figure 2: Overall architecture of the proposed BDC. The process of updating the aggregator's parameters (blue arrows) and the object detector's parameters (red arrows) is repeated iteratively until convergence.
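For orientation, the alternating updates in Figure 2 can be sketched as a simple training loop. The sketch below is conceptual only: every name in it (`aggregate_annotations`, `update_aggregator_params`, the toy data, and the losses) is a hypothetical placeholder, not this repository's actual API.

```python
import torch

# Toy stand-ins so the sketch runs end-to-end; all names here are
# hypothetical placeholders, not this repository's actual API.
detector = torch.nn.Linear(4, 4)                       # stands in for the object detector
optimizer = torch.optim.SGD(detector.parameters(), lr=0.01)
dataloader = [(torch.randn(8, 4), torch.randn(8, 4))]  # (images, crowd annotations)
annotator_params = torch.ones(10)                      # per-annotator quality estimates

def aggregate_annotations(crowd, preds, quality):
    # Fuse noisy crowd annotations into pseudo ground truth, weighting annotators
    # by their currently estimated quality (here: a trivial average as placeholder).
    return (crowd + preds.detach()) / 2

def update_aggregator_params(quality, detector, loader):
    # Bayesian re-estimation of annotator label quality (here: left unchanged).
    return quality

for epoch in range(3):
    for images, crowd_annotations in dataloader:
        predictions = detector(images)  # red arrows: detector update
        pseudo_labels = aggregate_annotations(crowd_annotations, predictions, annotator_params)
        loss = torch.nn.functional.mse_loss(predictions, pseudo_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # blue arrows: re-estimate each annotator's label quality
    annotator_params = update_aggregator_params(annotator_params, detector, dataloader)
```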
Disclaimer: The code has only been tested on Ubuntu 18.04 with Python 3.8.18, PyTorch 1.13.1 and CUDA 11.7, but it should work in environments with similar major versions.
- Install required libraries by running:

  ```bash
  conda create -n bdc python=3.8
  conda activate bdc
  pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
  pip install -r requirements.txt
  ```

  Alternatively, install from the conda yaml file with:

  ```bash
  conda env create -f env.yaml
  ```
- Install optional libraries for EVA model
- Install Apex v22.12
  ```bash
  # install xformers for eva02 https://github.com/facebookresearch/xformers/tree/v0.0.16
  pip install xformers==0.0.16
  # compile detectron2
  cd models/eva
  python -m pip install -e .
  cd ../../
  pip install mmcv-full==1.7.1 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html
  ```
- Download pretrained model weights (optional)
  ```bash
  mkdir pretrained_weights
  cd pretrained_weights
  wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt
  wget https://huggingface.co/Yuxin-CV/EVA-02/resolve/main/eva02/det/eva02_L_coco_det_sys_o365.pth
  ```
Download our synthesised crowdsourced annotations from here. Extract and place the annotations in the `data` directory.
Please download Pascal VOC, MSCOCO and VinDr-CXR and place the images in the `.data` directory following this structure:
```
.
└── .data/
    ├── MSCOCO/
    │   ├── annotations
    │   └── images
    ├── VINCXR/
    │   ├── train
    │   └── test
    └── VOCdevkit/
        └── VOC2007/
            ├── Annotations
            └── JPEGImages
```
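As a quick sanity check, a snippet along these lines can verify the layout; it is a hypothetical helper, not part of this repository:

```python
import os

# Expected dataset layout relative to the repository root (see the tree above)
expected = [
    ".data/MSCOCO/annotations",
    ".data/MSCOCO/images",
    ".data/VINCXR/train",
    ".data/VINCXR/test",
    ".data/VOCdevkit/VOC2007/Annotations",
    ".data/VOCdevkit/VOC2007/JPEGImages",
]
for path in expected:
    print(f"{path}: {'OK' if os.path.isdir(path) else 'MISSING'}")
```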
- Convert your annotations to YOLO labelling format (one txt file per image); see the sketch after this list.
- Train annotations follow the format: `x1,y1,x2,y2,class_id,annotator_id`
- Test annotations (if available; else create an empty text file for each test image) follow the format: `x1,y1,x2,y2,class_id`
- Create a new yaml file and modify the settings in it following the examples in the `data` folder.
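To make the expected label files concrete, the hypothetical snippet below writes one train and one test annotation file in the comma-separated layout described above. The file names and box values are made up for illustration, and whether coordinates are absolute pixels or normalised is an assumption; check the example files shipped in `data`.

```python
# Hypothetical example of the per-image annotation files described above.
# Train file, one box per line: x1,y1,x2,y2,class_id,annotator_id
train_lines = [
    "48,32,220,180,0,3",   # a class-0 box drawn by annotator 3
    "50,30,225,185,0,7",   # the same object, redrawn (noisily) by annotator 7
    "300,90,410,260,5,3",  # a class-5 box from annotator 3
]
with open("train_000001.txt", "w") as f:
    f.write("\n".join(train_lines))

# Test file uses the same layout without the annotator_id column
with open("test_000002.txt", "w") as f:
    f.write("120,40,260,200,2")
```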
Run the scripts with the `--help` argument to see descriptions of the arguments:

- Specify the aggregation method with the `--crowd-aggregator` argument; the available aggregators are in the `config` directory
- Specify the project name and experiment name with the `--project` and `--name` arguments (experiment outputs are saved to `outputs/<project>/<name>`)
- Specify the data yaml file with the `--data` argument
The examples below train the YOLOv7, Faster R-CNN and EVA models on our simplest synthetic dataset `VOC-FULL` using our proposed BDC aggregation method:
```bash
python train_yolov7.py --project crowd-annotations-voc --plots --batch-size 31 --weights ./pretrained_weights/yolov7_training.pt --crowd-aggregator config/bdc.yaml --name ann10_full-yolov7_pretrained-bdc --data data/voc_2007_ann10_full.yaml
python train_faster_rcnn.py --project crowd-annotations-voc --plots --batch-size 8 --crowd-aggregator config/bdc.yaml --name ann10_full-frcnn_pretrained-bdc --data data/voc_2007_ann10_full.yaml
python train_eva.py --project crowd-annotations-voc --plots --crowd-aggregator config/bdc.yaml --name ann10_full-eva_pretrained-bdc --data data/voc_2007_ann10_full.yaml
```
The examples below train the YOLOv7, Faster R-CNN and EVA models on our synthetic dataset `COCO-MIX` using our proposed BDC aggregation method:
```bash
python train_yolov7.py --cfg ./models/yolov7/config/yolov7_coco.yaml --epochs 20 --save-after 5 --save-interval 5 --project crowd-annotations-coco --plots --batch-size 31 --weights ./pretrained_weights/yolov7_training.pt --crowd-aggregator config/bdc.yaml --name ann1000_mix-yolov7_pretrained-bdc --data data/coco_ann1000_mix_disjoint.yaml
python train_faster_rcnn.py --project crowd-annotations-coco --plots --epochs 10 --save-after 5 --save-interval 1 --batch-size 8 --crowd-aggregator config/bdc.yaml --name ann1000_mix-frcnn_pretrained-bdc --data data/coco_ann1000_mix_disjoint.yaml
python train_eva.py --project crowd-annotations-coco --plots --epochs 10 --crowd-aggregator config/bdc.yaml --name ann1000_mix-eva_pretrained-bdc --data data/coco_ann1000_mix_disjoint.yaml
```
All experiments are logged to TensorBoard and WandB automatically; logging can be disabled in the training code.
The code to synthesise the datasets is implemented in Jupyter notebooks in the `notebook` directory. Please refer to the notebooks `synthetic_*.ipynb` for more details. The synthesisation code (with docstrings) can also be found in `utils.crowd.synthetic_data.py`.
General steps to synthesise annotations are as follows:

- Prepare your custom dataset following the instructions above
- Run `train_classification.py` to train the classification model and `test_classification.py` to obtain the confusion matrix for synthesising class labels
- Run `train_rpn.py` to train the RPN model for synthesising bounding boxes
- Follow the steps in the notebooks to synthesise annotations, modifying them according to your desired synthetic setting (changing the number of annotators, the classification and RPN models, etc.)
If you find this work useful for your research, please cite our work as
```bibtex
@inproceedings{bdc2024tan,
    title     = {Bayesian Detector Combination for Object Detection with Crowdsourced Annotations},
    author    = {Tan, Zhi Qin and Isupova, Olga and Carneiro, Gustavo and Zhu, Xiatian and Li, Yunpeng},
    booktitle = {Proc. Eur. Conf. Comput. Vis.},
    pages     = {329--346},
    year      = {2024},
    address   = {Milan, Italy},
}
```
Suggestions and opinions on this work (both positive and negative) are greatly welcomed. Please contact the authors by sending an email to `zhiqin1998 at hotmail.com`.
This work is developed based on the codebase of YOLOv7 and EVA. We thank the authors for releasing their source code and models.