English | 简体中文
NEWS: The branch bevdet_vendor-ros2 is a ROS 2 version based on the branch one. It organizes the TensorRT inference code of BEVDet into a ROS 2 package named bevdet_vendor. After compilation it produces the bevdet_vendor library, which users can call from their own ROS 2 nodes to run image inference. For an example of how to use the bevdet_vendor library, refer to the Autoware BEVDet inference node autoware_tensorrt_bevdet.
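For orientation, below is a minimal sketch of a custom ROS 2 node that could call the library. The bevdet_vendor API shown in comments (header name, class, and inference call) is assumed here purely for illustration; the actual interface should be taken from autoware_tensorrt_bevdet.

```cpp
// Minimal ROS 2 node sketch. The bevdet_vendor calls below are hypothetical
// placeholders; consult autoware_tensorrt_bevdet for the real interface.
#include <memory>
#include <rclcpp/rclcpp.hpp>
#include <sensor_msgs/msg/image.hpp>
// #include <bevdet/bevdet.h>  // hypothetical bevdet_vendor header

class BEVDetNode : public rclcpp::Node
{
public:
  BEVDetNode() : Node("bevdet_node")
  {
    // A real node would subscribe to one topic per camera; a single topic
    // is used here for brevity.
    sub_ = create_subscription<sensor_msgs::msg::Image>(
      "~/input/image", rclcpp::SensorDataQoS(),
      [this](sensor_msgs::msg::Image::ConstSharedPtr msg) { onImage(msg); });
    // bevdet_ = std::make_shared<BEVDet>(config_yaml, ...);  // hypothetical ctor
  }

private:
  void onImage(sensor_msgs::msg::Image::ConstSharedPtr msg)
  {
    // Gather the camera images, upload them to device memory, then call the
    // inference entry point exposed by the bevdet_vendor library, e.g.:
    // bevdet_->DoInfer(cam_data, boxes, inference_time);  // hypothetical call
    (void)msg;
  }

  rclcpp::Subscription<sensor_msgs::msg::Image>::SharedPtr sub_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<BEVDetNode>());
  rclcpp::shutdown();
  return 0;
}
```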
This project is a TensorRT implementation of BEVDet inference, written in C++. It can be tested on the nuScenes dataset and also ships a single test sample. BEVDet is a multi-camera 3D object detection model in bird's-eye view; for details, refer to BEVDet. The script for exporting the ONNX model is in this repository.
This project implements the following:
- TensorRT plugins: AlignBEV_plugin, Preprocess_plugin, BEVPool_plugin, GatherBEV_plugin (see the registration sketch after this list)
- Long-term model
- BEV-Depth model
- On an NVIDIA A4000, the BEVDet-r50-lt-depth model runs 6.24x faster with TRT FP16 than with PyTorch FP32
- On a Jetson AGX Orin, FP16 model inference takes about 27 ms, achieving real-time performance
- A dataloader for the nuScenes dataset that can be used for testing on the dataset
- Fine-tuning of the model to fix its sensitivity to input resize sampling, which degraded mAP and NDS
- An attempt at INT8 quantization
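Because the engine depends on these custom plugins, their creators must be visible to TensorRT's plugin registry before deserialization. Below is a minimal sketch of that loading step, assuming the plugins register themselves via REGISTER_TENSORRT_PLUGIN in their sources; the `model.engine` file name is illustrative.

```cpp
// Sketch: deserializing an engine that uses custom plugins (TensorRT 8.5).
// Assumes AlignBEV_plugin, Preprocess_plugin, BEVPool_plugin, and
// GatherBEV_plugin are registered at load time via REGISTER_TENSORRT_PLUGIN.
#include <NvInfer.h>
#include <fstream>
#include <iostream>
#include <memory>
#include <vector>

class Logger : public nvinfer1::ILogger
{
  void log(Severity severity, const char * msg) noexcept override
  {
    if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
  }
};

int main()
{
  Logger logger;
  std::ifstream file("model.engine", std::ios::binary);  // illustrative path
  std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                         std::istreambuf_iterator<char>());

  auto runtime = std::unique_ptr<nvinfer1::IRuntime>(
    nvinfer1::createInferRuntime(logger));
  // Deserialization fails with a "plugin not found" error if the custom
  // plugin creators were not registered with the global registry first.
  auto engine = std::unique_ptr<nvinfer1::ICudaEngine>(
    runtime->deserializeCudaEngine(blob.data(), blob.size()));
  return engine ? 0 : 1;
}
```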
The features of this project are as follows:
- A CUDA kernel that fuses resize, crop, and normalization for preprocessing
- Two interpolation methods in the preprocessing CUDA kernel: nearest-neighbor and bicubic (a simplified sketch follows this list)
- Alignment of adjacent-frame BEV features, implemented in C++ and CUDA
- Multi-threaded, multi-stream nvJPEG decoding
- Scale-NMS
- Removal of the preprocessing module from the BEV encoder
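As a rough illustration of the fused preprocessing idea, here is a simplified nearest-neighbor kernel. The memory layout, parameter names, and resize-then-crop convention are assumptions for the sketch, not the project's actual signature; the repository's kernel additionally implements bicubic interpolation.

```cuda
// Simplified sketch of a fused resize + crop + normalize kernel using
// nearest-neighbor interpolation. Reads an HWC uchar3 image, writes a
// normalized CHW float tensor in a single pass.
__global__ void preprocess_nearest_kernel(
    const uchar3 * __restrict__ src, float * __restrict__ dst,
    int src_w, int src_h,        // input image size
    int dst_w, int dst_h,        // network input size
    float scale,                 // resize factor applied before cropping
    int crop_x, int crop_y,      // top-left corner of the crop in the resized image
    float3 mean, float3 stdinv)  // per-channel normalization (1/std)
{
  int x = blockIdx.x * blockDim.x + threadIdx.x;
  int y = blockIdx.y * blockDim.y + threadIdx.y;
  if (x >= dst_w || y >= dst_h) return;

  // Map the output pixel back through crop and resize to a source pixel.
  int sx = min(src_w - 1, (int)((x + crop_x) / scale));
  int sy = min(src_h - 1, (int)((y + crop_y) / scale));
  uchar3 p = src[sy * src_w + sx];

  // Normalize and write the three channel planes.
  int area = dst_w * dst_h;
  dst[0 * area + y * dst_w + x] = ((float)p.x - mean.x) * stdinv.x;
  dst[1 * area + y * dst_w + x] = ((float)p.y - mean.y) * stdinv.y;
  dst[2 * area + y * dst_w + x] = ((float)p.z - mean.z) * stdinv.z;
}
```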
The following parts remain to be implemented:
- INT8 quantization
- Integrating the bevpool and adjacent-frame BEV feature alignment components into the engine as plugins
- Exception handling
All times are in milliseconds (ms), and nearest interpolation is used by default.
| Configuration | TRT-Engine | Postprocess | Mean Total |
|---|---|---|---|
| NVIDIA A4000 PyTorch FP32 | — | — | 86.24 |
| NVIDIA A4000 FP16 | 11.38 | 0.53 | 11.91 |
| Jetson AGX Orin FP16 | 26.60 | 0.99 | 27.60 |
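For reference, a common way to obtain such millisecond timings is to bracket the engine execution with CUDA events. The sketch below assumes a TensorRT 8.5 execution context and preallocated bindings set up elsewhere.

```cpp
#include <NvInfer.h>
#include <cuda_runtime.h>

// Returns the engine execution time in milliseconds, measured with CUDA
// events (the unit used in the table above). `context`, `buffers`, and
// `stream` are assumed to be created and populated by the caller.
float timed_enqueue(nvinfer1::IExecutionContext & context,
                    void * const * buffers, cudaStream_t stream)
{
  cudaEvent_t start, stop;
  cudaEventCreate(&start);
  cudaEventCreate(&stop);

  cudaEventRecord(start, stream);
  context.enqueueV2(buffers, stream, nullptr);  // TensorRT 8.5 async execution
  cudaEventRecord(stop, stream);
  cudaEventSynchronize(stop);

  float ms = 0.f;
  cudaEventElapsedTime(&ms, start, stop);
  cudaEventDestroy(start);
  cudaEventDestroy(stop);
  return ms;
}
```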
The project provides a test sample and can also run inference on the nuScenes dataset. When testing on the nuScenes dataset, you need the data_infos folder provided by this project. The data folder should have the following structure:
```
└── data
    ├── nuscenes
        ├── data_infos
            ├── samples_infos
                ├── sample0000.yaml
                ├── sample0001.yaml
                ├── ...
            ├── samples_info.yaml
            ├── time_sequence.yaml
        ├── samples
        ├── sweeps
        ├── ...
```
The data_infos folder can be downloaded from Google Drive or Baidu Netdisk.
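Once downloaded, the per-sample yaml files can be read with yaml-cpp (already a dependency, see below). The sketch uses placeholder key names, since the actual schema should be inspected in the downloaded files.

```cpp
#include <yaml-cpp/yaml.h>
#include <iostream>
#include <string>

// Sketch of reading one sample description with yaml-cpp. The keys "cams"
// and "data_path" are hypothetical placeholders, not the real schema.
int main()
{
  YAML::Node sample = YAML::LoadFile(
    "data/nuscenes/data_infos/samples_infos/sample0000.yaml");
  for (const auto & cam : sample["cams"]) {                 // hypothetical key
    std::cout << cam.second["data_path"].as<std::string>()  // hypothetical key
              << std::endl;
  }
  return 0;
}
```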
For desktop or server:
- CUDA 11.8
- cuDNN 8.6.0
- TensorRT 8.5.2.2
- yaml-cpp
- Eigen3
- libjpeg
For Jetson AGX Orin:
- Jetpack 5.1.1
- CUDA 11.4.315
- cuDNN 8.6.0
- TensorRT 8.5.2.2
- yaml-cpp
- Eigen3
- libjpeg
Build the project and use the export tool to convert the ONNX file into a TRT engine:
```shell
mkdir build && cd build
cmake .. && make
./export model.onnx model.engine
```
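For context, the export step corresponds roughly to the standard TensorRT 8.5 ONNX build flow sketched below. This is an outline under assumed defaults (FP16 enabled, no error handling), not the repository's actual export tool.

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <memory>

class Logger : public nvinfer1::ILogger
{
  void log(Severity severity, const char * msg) noexcept override
  {
    if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
  }
};

// Usage: ./export model.onnx model.engine
int main(int argc, char ** argv)
{
  if (argc < 3) return 1;
  Logger logger;

  // Parse the ONNX file into an explicit-batch network definition.
  auto builder = std::unique_ptr<nvinfer1::IBuilder>(
    nvinfer1::createInferBuilder(logger));
  auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(
    builder->createNetworkV2(1U << static_cast<uint32_t>(
      nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH)));
  auto parser = std::unique_ptr<nvonnxparser::IParser>(
    nvonnxparser::createParser(*network, logger));
  parser->parseFromFile(
    argv[1], static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

  // Enable FP16 to match the timings reported above, then build and save.
  auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(
    builder->createBuilderConfig());
  config->setFlag(nvinfer1::BuilderFlag::kFP16);

  auto serialized = std::unique_ptr<nvinfer1::IHostMemory>(
    builder->buildSerializedNetwork(*network, *config));
  std::ofstream out(argv[2], std::ios::binary);
  out.write(static_cast<const char *>(serialized->data()), serialized->size());
  return 0;
}
```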
Inference:
```shell
./bevdemo ../configure.yaml
```