This repository contains the code, experimental data, and the oxford-unknown dataset for the work described in
4D Generic Video Object Proposals (https://arxiv.org/pdf/1901.09260.pdf)
By Aljosa Osep, Paul Voigtlaender, Mark Weber, Jonathon Luiten, Bastian Leibe, Computer Vision Group, RWTH Aachen University
- Upload sequences surrounding labeled frames of Oxford Dataset
- Upload result files
- Make sure that the uploaded version of the tracker and configs reproduces the paper results
- Detailed instructions
For the labeling process, we manually selected 150 images of the Oxford RobotCar dataset. The subset of images we labeled is available here.
Image sequences (the temporal neighborhood of the annotated frames) are available here.
Additional data for sequences (precomputed proposals) are available here.
We labeled 1,494 bounding boxes (1,081 known, 413 unknown) covering the visible portions of objects (non-amodal) by clicking the extremal points.
Known labeled classes (those that overlap with the COCO classes) are car, person, bike and bus. In addition, we labeled several object classes that are not present in the COCO dataset; these are labeled as unknown objects. The most notable object classes in the unknown set are the following: signage, pole, stone road sign, traffic cone, street sign, rubbish bin, transformer, post box, booth and stroller.
Category | Car | Person | Bike | Bus | Unknown | All |
---|---|---|---|---|---|---|
#instances | 599 | 354 | 78 | 50 | 413 | 1494 |
Portion | 40.1% | 23.7% | 5.2% | 3.3% | 27.6% | 100% |
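The Portion row follows from the instance counts, so it can be recomputed as a sanity check. The snippet below is only an illustrative sketch: the dictionary and variable names are made up for this example and are not part of the repository.

```python
# Instance counts copied from the table above.
counts = {"Car": 599, "Person": 354, "Bike": 78, "Bus": 50, "Unknown": 413}

total = sum(counts.values())          # all labeled boxes
known = total - counts["Unknown"]     # boxes of COCO-overlapping classes

# Share of each category, in percent, rounded to one decimal place.
portions = {name: round(100.0 * n / total, 1) for name, n in counts.items()}

print(f"total={total}, known={known}")
for name, pct in portions.items():
    print(f"{name}: {pct}%")
```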
We evaluated the performance of several methods on both the known and unknown splits; see our paper for details. All results will be available for download soon.
Please find the labels in $REPO/eval/oxford_labels. Labels are stored in JSON format. To evaluate recall, use the script $REPO/eval/eval_single_image_proposals.py. To see how to use this script, take a look at $REPO/eval/run_evaluation.sh.
Further instructions and a description of the label format will be available soon. Until then, we recommend stepping through the eval_single_image_proposals.py script to understand the format.
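For readers unfamiliar with proposal recall, the following is a minimal sketch of the general idea: a labeled box counts as recalled if at least one proposal overlaps it sufficiently. It is not the code from eval_single_image_proposals.py. The box layout [x1, y1, x2, y2], the IoU threshold of 0.5, and the function names box_iou and recall are assumptions made for this illustration only.

```python
import numpy as np


def box_iou(gt_box, proposals):
    """IoU between one box and an (N, 4) array of boxes, all [x1, y1, x2, y2]."""
    x1 = np.maximum(gt_box[0], proposals[:, 0])
    y1 = np.maximum(gt_box[1], proposals[:, 1])
    x2 = np.minimum(gt_box[2], proposals[:, 2])
    y2 = np.minimum(gt_box[3], proposals[:, 3])
    # Negative widths/heights mean no overlap, so clip them to zero.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_gt = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1])
    area_pr = (proposals[:, 2] - proposals[:, 0]) * (proposals[:, 3] - proposals[:, 1])
    return inter / (area_gt + area_pr - inter)


def recall(gt_boxes, proposals, iou_thresh=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    gt_boxes = np.asarray(gt_boxes, dtype=float).reshape(-1, 4)
    proposals = np.asarray(proposals, dtype=float).reshape(-1, 4)
    if len(gt_boxes) == 0 or len(proposals) == 0:
        return 0.0
    hits = sum(box_iou(gt, proposals).max() >= iou_thresh for gt in gt_boxes)
    return hits / len(gt_boxes)


# Toy example: the first labeled box is covered, the second is missed.
example_gt = [[0, 0, 10, 10], [20, 20, 30, 30]]
example_proposals = [[0, 0, 10, 9], [50, 50, 60, 60]]
example_recall = recall(example_gt, example_proposals)
print(example_recall)
```

In practice, recall is usually reported as a function of the number of proposals per image; the sketch above covers only a single fixed proposal set.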
- Proposal evaluation:
  - Python 3.x (for running eval_single_image_proposals.py)
  - pycocotools
  - matplotlib
  - numpy
- Tracking evaluation (in addition):
  - Python 2.7 (required by tracking evaluation legacy scripts)
  - munkres
  - tabulate
In order to run the video object proposal generator code, your setup has to meet the following minimum requirements (tested versions are given in parentheses; other versions might work, too):
- cmake (tested with 3.9.6, earlier versions should work too)
- GCC 5.4.0
- Libs:
  - Eigen (3.x)
  - Boost (1.55 or later)
  - OpenCV (tested with 3.x, 4.x)
  - PCL (tested on 1.8.0 and 1.9.x) (note: requires FLANN and VTK for the 3D visualizer)
Note: the paths below are only suggestions. Any other paths will work too, but you will need to adapt $REPO/script/exec_tracker.sh accordingly.
- Download the KITTI tracking dataset and place it in /home/${USER}/data/kitti_tracking
- Download the precomputed segmentations we provide for the KITTI tracking dataset and unzip them to /home/${USER}/data/kitti_tracking/preproc
- Clone this repo to /home/${USER}/projects
mkdir build && cd build
cmake ..
make all
- Enter $REPO/script/
- Execute exec_tracker.sh
- SEGM_INPUTS - Specify which pre-computed segmentations to use: Mask Proposal R-CNN (mprcnn_coco; recommended), Sharpmask (sharpmask_coco), or Mask R-CNN fine-tuned on KITTI (mrcnn_tuned).
- INF_MODEL - Specify which model should be used for inference: 4DGVT (recommended) or CAMOT.
- INPUT_CONF_THRESH - Detection/proposal score threshold. If it is set to 0.8 or more, only confident detections are forwarded.
- MAX_NUM_PROPOSALS - Maximum number of proposals fed to the track generator per frame. More proposals mean higher recall but slower processing. Setting it above 500 is not recommended.
- Running CAMOT vs. 4DGVT
  - TODO
- Inputs to the tracker
  - You can use our precomputed segmentations for KITTI
  - Provide your own (export per-frame segmentations to JSON and pass the JSON files to the tracker) using:
    - Sharpmask repo
    - Our Mask Proposal R-CNN (MP R-CNN) repo
  - You can also use MaskX R-CNN, trained on 3K+ classes on the Visual Genome dataset (project page + code)
- External libraries
  - The tracker ships the following external modules:
    - libelas - disparity estimation (http://www.cvlibs.net/software/libelas/)
    - libviso2 - egomotion estimation (http://www.cvlibs.net/software/libviso/)
    - nlohmann::json - JSON parser (https://github.com/nlohmann/json)
    - maskApi - COCO mask API for C (https://github.com/cocodataset/cocoapi)
- Additional remarks about CAMOT
  - TODO
- Run the tracker in release mode (otherwise, it will be slow).
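As a rough illustration of how the INPUT_CONF_THRESH and MAX_NUM_PROPOSALS options described above interact, the sketch below first drops proposals under the score threshold and then keeps the highest-scoring ones up to the per-frame limit. This is an assumption about the general behavior, not the tracker's actual implementation (the tracker itself is compiled with CMake/GCC and may order or filter proposals differently). The function select_proposals and the dictionary keys id and score are hypothetical.

```python
def select_proposals(proposals, input_conf_thresh=0.8, max_num_proposals=500):
    """Keep proposals with score >= threshold, then at most the top-k by score."""
    kept = [p for p in proposals if p["score"] >= input_conf_thresh]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return kept[:max_num_proposals]


# Hypothetical per-frame proposals; only "score" matters for the selection.
frame_proposals = [
    {"id": 0, "score": 0.95},
    {"id": 1, "score": 0.40},
    {"id": 2, "score": 0.85},
    {"id": 3, "score": 0.90},
]

selected = select_proposals(frame_proposals, input_conf_thresh=0.8, max_num_proposals=2)
selected_ids = [p["id"] for p in selected]
print(selected_ids)
```

Lowering the threshold or raising the limit lets more proposals through, which matches the trade-off noted above: higher recall at the cost of slower processing.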
If you have any issues or questions about this repository, please contact me at aljosa (dot) osep (at) tum.de
If you find this repository useful in your research, please cite:
@inproceedings{Osep18ICRA,
author = {O\v{s}ep, Aljo\v{s}a and Mehner, Wolfgang and Voigtlaender, Paul and Leibe, Bastian},
title = {Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking},
booktitle = {ICRA},
year = {2018}
}
@inproceedings{Osep19ICRA,
author = {O\v{s}ep, Aljo\v{s}a and Voigtlaender, Paul and Weber, Mark and Luiten, Jonathon and Leibe, Bastian},
title = {4D Generic Video Object Proposals},
booktitle = {ICRA},
year = {2020}
}
When using oxford-unknown labels, please cite the original dataset:
@article{Maddern17IJRR,
Author = {Will Maddern and Geoff Pascoe and Chris Linegar and Paul Newman},
Title = {{1 Year, 1000km: The Oxford RobotCar Dataset}},
Journal = {The International Journal of Robotics Research (IJRR)},
Volume = {36},
Number = {1},
Pages = {3-15},
Year = {2017}
}
- If you want to use self-compiled libs, you may need to specify these paths (e.g., edit the CMake cache or use ccmake): PCL_DIR, OpenCV_DIR, BOOST_ROOT
- CMake Error Unable to find the requested Boost libraries. Unable to find the Boost header files. Please set BOOST_ROOT to the root directory containing Boost or BOOST_INCLUDEDIR to the directory containing Boost's headers.
  For certain combinations of Boost and CMake versions, CMake may fail to find all dependencies. This typically happens when using a newer Boost with an older CMake; try the most recent CMake to avoid this issue.
- I had issues compiling PCL with VTK 9.x; I recommend using VTK 8.x.
GNU General Public License (http://www.gnu.org/licenses/gpl.html)
Copyright (c) 2017 Aljosa Osep
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.