This repo is being deprecated: Please use https://github.com/levipereira/deepstream-yolo-e2e
This project was developed using DeepStream SDK 7.0.
DeepStream 7.0 is now supported on Windows WSL2, which greatly aids in application development.
This project combines DeepStream 7, NVIDIA's real-time video analytics platform, with YOLOv9 for object detection and instance segmentation.
YOLOv9 introduces Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). Its authors report improved efficiency and accuracy on the MS COCO dataset.
This repo supports Object Detection and Instance Segmentation.
This project involves several important steps as outlined below:
git clone https://github.com/levipereira/deepstream-yolov9.git
cd deepstream-yolov9
git submodule update --init --recursive
Choose one option:
- Download Models: YOLOv9-C Detection/Segmentation models pre-trained on the COCO Dataset are available in this repository, exported in ONNX format.
cd models
./download_models.sh
cd ..
| Model | Test Size | AP val | AP50 val | AP75 val | Param. | FLOPs |
|---|---|---|---|---|---|---|
| YOLOv9-T | 640 | 38.3% | 53.1% | 41.3% | 2.0M | 7.7G |
| YOLOv9-S | 640 | 46.8% | 63.4% | 50.7% | 7.1M | 26.4G |
| YOLOv9-M | 640 | 51.4% | 68.1% | 56.1% | 20.0M | 76.3G |
| YOLOv9-C | 640 | 53.0% | 70.2% | 57.8% | 25.3M | 102.1G |

| Model | Test Size | Param. | FLOPs | AP box | AP mask |
|---|---|---|---|---|---|
| YOLOv9-C-SEG | 640 | 27.4M | 145.5G | 53.3% | 43.5% |

- You can export your own custom YOLOv9 models to ONNX.
Download or build the TensorRT library libnvinfer_plugin.so.8.6.1 with the custom TensorRT EfficientNMSX plugin.
EfficientNMSX is a modified version of the EfficientNMS plugin that adds a layer called det_indices. You can either compile the plugin yourself or install the precompiled version provided.
Choose one option:
- Download
cd TensorRTPlugin
wget https://github.com/levipereira/deepstream-yolov9/releases/download/v1.0/libnvinfer_plugin.so.8.6.1
cd ..
- Build the plugin from the source code in TensorRTPlugin (this can take a long time)
sudo docker pull nvcr.io/nvidia/deepstream:7.0-triton-multiarch
Start the Docker container from the deepstream-yolov9 directory:
sudo docker run \
-it \
--privileged \
--rm \
--name=deepstream_yolov9 \
--net=host \
--gpus all \
-e DISPLAY=$DISPLAY \
-e CUDA_CACHE_DISABLE=0 \
--device /dev/snd \
-v /tmp/.X11-unix/:/tmp/.X11-unix \
-v `pwd`:/apps/deepstream-yolov9 \
-w /apps/deepstream-yolov9 \
nvcr.io/nvidia/deepstream:7.0-triton-multiarch
4. Install libnvinfer_plugin with plugin TRT_EfficientNMSX (required only for Instance Segmentation models)
cd TensorRTPlugin
./patch_libnvinfer.sh
cd ..
CUDA_VER=12.2 make -C nvdsinfer_yolo
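The build command above hard-codes CUDA_VER=12.2. On a system with a different toolkit, the value could be read from nvcc instead. The following is a minimal sketch, assuming nvcc is on PATH and prints its usual "release X.Y" line. The helper names parse_cuda_release and detect_cuda_ver are hypothetical and not part of this repo.

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and
# print the first "MAJOR.MINOR" that follows the word "release".
parse_cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: use the detected version when nvcc is available,
# otherwise fall back to 12.2, the value used in this README.
detect_cuda_ver() {
    dcv_ver=""
    if command -v nvcc >/dev/null 2>&1; then
        dcv_ver=$(nvcc --version | parse_cuda_release)
    fi
    echo "${dcv_ver:-12.2}"
}

# Print the resulting build command instead of running it.
echo "CUDA_VER=$(detect_cuda_ver) make -C nvdsinfer_yolo"
```

The sketch only prints the command, so it can be checked before the build is started by hand.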
## Detection
deepstream-app -c deepstream_yolov9_det.txt
## Segmentation
deepstream-app -c deepstream_yolov9_mask.txt
The first run may take up to 15 minutes because the TensorRT engine file is built with FP16 precision.
During this process, it may seem like it's stuck on the following line.
WARNING: [TRT]: onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Please be patient and wait for it to complete.
This implementation supports dynamic shapes and dynamic batch sizes. To modify these settings, change the following configurations:
config_pgie_yolo9_det.txt
config_pgie_yolov9_mask.txt
batch-size=1
infer-dims=3;640;640
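The two keys above can also be changed from a script instead of by hand. This is a minimal sketch, assuming each key appears once at the start of a line in the config file. set_pgie_shape is a hypothetical helper name, not something shipped with this repo.

```shell
# Hypothetical helper: rewrite batch-size and infer-dims in an nvinfer config file.
# Usage: set_pgie_shape <config-file> <batch-size> <network-size>
set_pgie_shape() {
    sps_cfg=$1
    sps_batch=$2
    sps_size=$3
    sps_tmp=$(mktemp) || return 1
    # Replace only the two target keys; every other line is copied unchanged.
    sed -e "s/^batch-size=.*/batch-size=${sps_batch}/" \
        -e "s/^infer-dims=.*/infer-dims=3;${sps_size};${sps_size}/" \
        "$sps_cfg" > "$sps_tmp" || { rm -f "$sps_tmp"; return 1; }
    # Write back through the original file so its permissions are preserved.
    cat "$sps_tmp" > "$sps_cfg"
    rm -f "$sps_tmp"
}
```

For example, `set_pgie_shape config_pgie_yolov9_mask.txt 4 640` would set batch-size=4 and keep the 640x640 input size.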
Building the TensorRT engine file ahead of time can also be used for performance tests.
It avoids rebuilding the TRT engine file on each execution.
Important: This step can take a long time, around 15 minutes per model. Note: The model was exported with dynamic batch and size, so you can change both.
Optional flags:
-b -- batch_size (default is 1)
-n -- network_size (default is 640)
-p -- precision fp32/fp16/int8 (default fp32)
cd models
./build_engine.sh
cd ..
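The optional flags can be combined with the build step. The sketch below validates the three values and prints the resulting call. The flag letters and defaults are taken from the list above; passing all three in one call is an assumption about build_engine.sh, and engine_cmd is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: check the options, then print the build_engine.sh call.
# Usage: engine_cmd [batch_size] [network_size] [precision]
engine_cmd() {
    ec_batch=${1:-1}
    ec_size=${2:-640}
    ec_prec=${3:-fp32}
    # Only the precisions listed in this README are accepted.
    case "$ec_prec" in
        fp32|fp16|int8) ;;
        *) echo "unsupported precision: $ec_prec" >&2; return 1 ;;
    esac
    echo "./build_engine.sh -b $ec_batch -n $ec_size -p $ec_prec"
}

# Example: batch size 4, 640 network size, FP16 engine.
engine_cmd 4 640 fp16
```

The printed line is meant to be run from the models directory, in place of the plain ./build_engine.sh call.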
Update the config_pgie files accordingly:
config_pgie_yolo9_det.txt
config_pgie_yolov9_mask.txt
batch-size=1
infer-dims=3;640;640
# 0: FP32 1: INT8 2: FP16
network-mode=0
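The network-mode numbering is not in the same order as the precision names, so mixing up INT8 and FP16 is easy. The sketch below maps the precision names used by the build flags to the network-mode values listed in the comment above. precision_to_network_mode is a hypothetical helper name.

```shell
# Hypothetical helper: map a precision name to the nvinfer network-mode value.
# Mapping taken from the config comment: 0: FP32, 1: INT8, 2: FP16.
precision_to_network_mode() {
    case "$1" in
        fp32) echo 0 ;;
        int8) echo 1 ;;
        fp16) echo 2 ;;
        *) echo "unknown precision: $1" >&2; return 1 ;;
    esac
}

# Example: print the config line that matches an FP16 engine.
echo "network-mode=$(precision_to_network_mode fp16)"
```

The printed line can then be copied into the matching config_pgie file.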