A collection of command-line tools to assist with extracting, merging, and training on datasets from a CVAT installation.
The basis of this code originates from the Tool for Painless Object Detection (OpenTPOD), developed by Junjue Wang.
We also pull in datumaro, the backend used by CVAT, which handles reading, writing, converting, and merging various dataset formats.
You can create a config file in your home directory named .opentpod-tools with common settings such as the CVAT installation base URL, username, and password.
[cvat]
url = http://localhost:8080
username = user
password = pass
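This file is ordinary INI syntax, so it can be read with Python's standard configparser. A minimal sketch of how a [cvat] section like the one above is parsed (illustrative only; the exact lookup logic inside opentpod-tools may differ):

```python
from configparser import ConfigParser

# Example config matching the snippet above; the real tools would
# read this from ~/.opentpod-tools instead of a string.
example = """\
[cvat]
url = http://localhost:8080
username = user
password = pass
"""

config = ConfigParser()
config.read_string(example)

url = config["cvat"]["url"]
username = config["cvat"]["username"]
password = config["cvat"]["password"]
```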
# set up a virtualenv with a newer pip
python3 -m venv venv
venv/bin/pip install --upgrade pip
venv/bin/pip install git+https://github.com/cmusatyalab/opentpod-tools.git
This is my first attempt at using Poetry to manage Python package dependencies, so I may be doing everything wrong.
It should be possible to build this package locally as follows:
# install poetry, see https://python-poetry.org/docs/
# Make sure you install for python3
#
# I used (the not recommended way): pip3 install --user poetry
git clone https://github.com/cmusatyalab/opentpod-tools.git
cd opentpod-tools
poetry install
This will create a virtualenv with all the dependencies and install opentpod-tools into it. You can start a shell in the installed virtualenv with poetry shell and work from there.
The following assumes that opentpod-tools has been installed globally, or that you are running it from within a virtualenv (see poetry run / poetry shell).
Download, merge, and clean up datasets.
# upload videos to CVAT, and label them
# download labeled datasets
tpod-download [--project|--task|--job] <project/task/job ID> ...
# optionally merge multiple downloaded datasets
datum merge -o merged datumaro_task_N .. datumaro_task_M
# filter frames with no annotations (and optionally annotated occlusions)
tpod-filter [--filter-occluded] [-o filtered] merged
# remove similar image frames with tpod-unique
# options are:
#   -m/--method:    sequential: only check against the last kept 'unique' image (default)
#                   random: check against a random subset of the kept unique images (see -r/--ratio)
#                   exhaustive: check each new image against all kept unique images
#   -t/--threshold: minimum difference between the current image and the kept image(s), default = 10
tpod-unique [-m sequential|random|exhaustive] [-o unique] filtered [-t 10 -r 0.7]
# split into training and validation subsets
datum transform -t random_split -o split unique -- -s train:0.9 -s val:0.1 [-s test:...]
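The sequential method of tpod-unique above can be pictured as: keep a frame only when it differs from the last kept frame by more than the threshold. A toy sketch of that idea, where a made-up scalar "difference" stands in for the real image comparison tpod-unique performs:

```python
def dedup_sequential(frames, threshold=10):
    """Keep a frame only if it differs from the last *kept* frame by
    more than `threshold`. `frames` are (name, value) pairs; the scalar
    value is a stand-in for a real image, and abs() is a stand-in for
    whatever image-difference metric tpod-unique actually computes."""
    kept = []
    for name, value in frames:
        if not kept or abs(value - kept[-1][1]) > threshold:
            kept.append((name, value))
    return [name for name, _ in kept]

# f2 and f3 are within 10 of f1 (the last kept frame), so they are
# dropped; f4 differs by 25 and is kept.
frames = [("f1", 0), ("f2", 4), ("f3", 9), ("f4", 25)]
print(dedup_sequential(frames, threshold=10))  # ['f1', 'f4']
```

The random and exhaustive methods differ only in which kept frames each new frame is compared against: a random subset, or all of them.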
Explore the dataset.
# high-level information (number of images in the training and validation subsets)
datum dinfo split
# detailed statistics (distribution of labels, area of labeled features, etc.)
datum stats split
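Among other things, datum stats reports how labels are distributed across the dataset's annotations. As a toy illustration of what that distribution means (the annotation list here is made up):

```python
from collections import Counter

# Hypothetical (frame, label) annotation pairs, like those a detection
# dataset carries; datum stats computes this kind of label distribution
# (plus areas, attribute counts, etc.) over the real dataset.
annotations = [
    ("frame1", "person"), ("frame1", "car"),
    ("frame2", "person"), ("frame3", "person"),
]

distribution = Counter(label for _, label in annotations)
print(distribution)  # Counter({'person': 3, 'car': 1})
```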
To train a YOLO object detector, install (or reinstall) opentpod-tools with the yolo training extras on the machine where you will do the training.
# when installing from the source tree with poetry
poetry install -E yolo
# when installing from the github repository with pip
pip install git+https://github.com/cmusatyalab/opentpod-tools.git#egg=opentpod-tools[yolo]
# or, just install the yolo dependency
pip install ultralytics
Then we can export the dataset to the proper format and train a YOLO object detector.
# export to yolo_ultralytics format
datum convert -i split -f yolo_ultralytics -o yolo-dataset -- --save-media
# train an object detector model
yolo detect train data=$(pwd)/yolo-dataset/data.yaml model=yolov8n.pt epochs=100 imgsz=640 project=yolo-project