In computer vision, many datasets are used for different tasks (e.g., classification, detection, segmentation, tracking). These datasets, in the form of images and/or videos, are used to train and test models.
Here we take a look at the statistics of datasets used for three main tasks in computer vision (Detection, Segmentation, and Tracking). Along the way, we provide converters to standardized formats and PyTorch data loader implementations for specific datasets.
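As a rough illustration of what a format converter might do, the sketch below turns a normalized center-based box into a pixel-based `[x, y, width, height]` box. Both the input layout (YOLO-style) and the output layout (COCO-style) are assumptions made for this example, and `yolo_to_coco_bbox` is a hypothetical name, not a function from this repo.

```python
def yolo_to_coco_bbox(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel [x, y, w, h].

    Assumes YOLO-style input (center coordinates and size, all in [0, 1])
    and COCO-style output (top-left corner plus size, in pixels).
    """
    cx, cy, w, h = box
    box_w = w * img_w
    box_h = h * img_h
    # Shift from the box center to its top-left corner.
    x = cx * img_w - box_w / 2
    y = cy * img_h - box_h / 2
    return [x, y, box_w, box_h]


# A box covering the central half of a 100x200 image.
converted = yolo_to_coco_bbox((0.5, 0.5, 0.5, 0.5), img_w=100, img_h=200)
```

Bringing every dataset into one box convention first means the same statistics code can run on all of them.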
We look at the statistics of each dataset and compare them at the end. To do this, we load each dataset and extract its statistics programmatically, visualize them in a way that is easy to understand, and finally compare them across datasets.
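To make "extracting statistics programmatically" concrete, here is a minimal sketch that derives a few counts from an annotation dictionary. It assumes a COCO-style layout (`images`, `annotations`, `categories` keys); `basic_detection_stats` is a hypothetical helper for illustration, not part of this repo.

```python
from collections import Counter


def basic_detection_stats(coco):
    """Compute simple counts from a COCO-style annotation dict (assumed layout)."""
    images = coco.get("images", [])
    annotations = coco.get("annotations", [])
    # Map category id -> name so per-category counts are readable.
    categories = {c["id"]: c["name"] for c in coco.get("categories", [])}

    per_category = Counter(
        categories.get(a["category_id"], "unknown") for a in annotations
    )
    num_images = len(images)
    return {
        "num_images": num_images,
        "num_annotations": len(annotations),
        "num_categories": len(categories),
        "annotations_per_image": len(annotations) / num_images if num_images else 0.0,
        "instances_per_category": dict(per_category),
    }


# Toy data standing in for a real annotation file loaded with json.load().
toy = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 2},
        {"id": 12, "image_id": 2, "category_id": 1},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "car"}],
}
stats = basic_detection_stats(toy)
```

Returning a plain dictionary keeps the result easy to serialize, plot, or place side by side with the statistics of another dataset.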
| Task | Detection Based | Instance Segmentation | Multi Object Tracking | Video Instance Segmentation |
|---|---|---|---|---|
| Dataset | | | | |
For more details on how to use the repo, please refer to the docs.
# Get REPO
#1. clone and set up the repo
git clone https://github.com/ozerlabs-proxy/vision-datasets-stats.git
#2. cd into the repo
cd vision-datasets-stats
#3. create and activate the environment
#we require conda to be installed
#alternatively you can use any other env
conda env create -f environment.yaml
conda activate VisionStats
#4. follow along the notebooks
A number of statistics can be generated for each dataset. They vary depending on the task; for most datasets we derive the following:
| - | Detection | Tracking |
|---|---|---|
| Stats | | |
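As an example of a statistic that only makes sense for tracking, the sketch below summarizes track lengths. The `(frame_index, track_id)` record layout and the `track_length_stats` name are hypothetical simplifications chosen for illustration; real tracking annotations carry more fields.

```python
from collections import defaultdict
from statistics import mean


def track_length_stats(records):
    """Summarize track lengths from (frame_index, track_id) pairs.

    The record layout is an assumed simplification, not a dataset format.
    """
    frames_by_track = defaultdict(set)
    for frame_index, track_id in records:
        # A set ignores duplicate entries for the same track and frame.
        frames_by_track[track_id].add(frame_index)

    lengths = [len(frames) for frames in frames_by_track.values()]
    if not lengths:
        return {"num_tracks": 0, "mean_track_length": 0.0, "max_track_length": 0}
    return {
        "num_tracks": len(lengths),
        "mean_track_length": mean(lengths),
        "max_track_length": max(lengths),
    }


# Track "a" is visible in three frames, track "b" in one.
records = [(1, "a"), (2, "a"), (3, "a"), (1, "b")]
track_stats = track_length_stats(records)
```

Here a track's length is the number of distinct frames in which it appears, so gaps in a track are not counted.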
We are open to contributions. If you have a dataset that you would like to add to the list, please follow the steps in the contribution guide.