PointPillars inference with TensorRT

Last update: Dec 31, 2022

This repository contains the sources and model for PointPillars inference using TensorRT. The model is created with OpenPCDet and modified with onnx_graphsurgeon.

Inference has four parts:
generateVoxels: converts the point cloud into voxels with 4 channels
generateFeatures: converts the voxels into feature maps with 10 channels
Inference: converts the feature maps into raw bounding box, class score, and direction data
Postprocessing: parses the bounding boxes, class scores, and directions
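
The first two stages can be illustrated with a minimal pure-Python sketch. It is not the repository's CUDA code. The grid constants are assumed KITTI-style values, and the 10-channel layout (x, y, z, intensity, offset from the voxel's mean point, offset from the voxel's center) is an assumption modeled on OpenPCDet's pillar feature encoder. The function names `generate_voxels` and `generate_features` are hypothetical stand-ins for the stages named above.

```python
from collections import defaultdict

# Assumed KITTI-style grid, for illustration only (not read from the repository).
VOXEL_SIZE = (0.16, 0.16, 4.0)
RANGE_MIN = (0.0, -39.68, -3.0)
RANGE_MAX = (69.12, 39.68, 1.0)
MAX_POINTS_PER_VOXEL = 32


def generate_voxels(points):
    """Group (x, y, z, intensity) points by grid cell, keeping the first 32 per cell."""
    voxels = defaultdict(list)
    for x, y, z, r in points:
        inside = all(
            lo <= v < hi for v, lo, hi in zip((x, y, z), RANGE_MIN, RANGE_MAX)
        )
        if not inside:
            continue  # drop points outside the detection range
        ix = int((x - RANGE_MIN[0]) / VOXEL_SIZE[0])
        iy = int((y - RANGE_MIN[1]) / VOXEL_SIZE[1])
        if len(voxels[(ix, iy)]) < MAX_POINTS_PER_VOXEL:
            voxels[(ix, iy)].append((x, y, z, r))
    return dict(voxels)


def generate_features(voxels):
    """Expand every 4-channel point into 10 channels (assumed layout)."""
    features = {}
    for (ix, iy), pts in voxels.items():
        n = len(pts)
        mean = [sum(p[i] for p in pts) / n for i in range(3)]
        center = (
            RANGE_MIN[0] + (ix + 0.5) * VOXEL_SIZE[0],
            RANGE_MIN[1] + (iy + 0.5) * VOXEL_SIZE[1],
            RANGE_MIN[2] + 0.5 * VOXEL_SIZE[2],
        )
        features[(ix, iy)] = [
            (x, y, z, r,
             x - mean[0], y - mean[1], z - mean[2],
             x - center[0], y - center[1], z - center[2])
            for (x, y, z, r) in pts
        ]
    return features
```

Calling `generate_features(generate_voxels(points))` on a list of `(x, y, z, intensity)` tuples returns a dictionary keyed by grid cell, with one 10-value tuple per retained point.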

Data

The demo uses data from the KITTI Dataset. More data can be downloaded by following the link in GETTING_STARTED.

Model

The onnx file can be converted from a model trained with OpenPCDet, using the tool in the demo.

Build

Prerequisites

To build the PointPillars inference demo, CUDA and TensorRT with the PillarScatter layer are required. The PillarScatter layer is already implemented as a TensorRT plugin in the demo.

JetPack 4.5
TensorRT v7.1.3
CUDA 10.2 + cuDNN 8.0.0
PCL is optional; it is used to store pcd point cloud files

Compile

$ cd test
$ mkdir build
$ cd build
$ cmake ..
$ make -j$(nproc)

Run

$ ./demo

Environments

JetPack 4.5
CUDA 10.2 + cuDNN 8.0.0 + TensorRT 7.1.3
NVIDIA Jetson AGX Xavier

Performance

FP16

| Stage             | GPU time (ms) |
| ----------------- | ------------- |
| generateVoxels    | 0.22   |
| generateFeatures  | 0.21   |
| Inference         | 30.75  |
| Postprocessing    | 3.19   |
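
As a quick reading of the table, the per-stage times can be added up to estimate end-to-end latency. The sketch below assumes the four stages run back to back with no overlap and no additional overhead, which the table itself does not state. Under that assumption the total is about 34.37 ms per frame (roughly 29 frames per second), with the Inference stage taking close to 90% of the time.

```python
# Per-stage FP16 GPU times in milliseconds, copied from the table above.
STAGE_MS = {
    "generateVoxels": 0.22,
    "generateFeatures": 0.21,
    "Inference": 30.75,
    "Postprocessing": 3.19,
}

# Assumption: stages execute sequentially, so latencies simply add up.
total_ms = sum(STAGE_MS.values())
frames_per_second = 1000.0 / total_ms
inference_share = STAGE_MS["Inference"] / total_ms

print(f"total: {total_ms:.2f} ms, about {frames_per_second:.1f} FPS")
print(f"Inference share of total: {inference_share:.1%}")
```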

Note

The GPU processes all points in parallel, and the points kept for each voxel are selected from the point cloud in no fixed order, so the output of generateVoxels on the GPU varies between runs. The CPU always selects the first 32 points, so the output of generateVoxels on the CPU is fixed.
The demo caches the onnx file to improve performance. To use a new onnx file, first remove the cache file in "./model".
MAX_VOXELS in params.h determines how much cache memory is allocated during inference. Decrease the value to save memory.
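
The difference between the CPU and GPU voxel outputs described in the first note can be modeled with a small sketch. This is only an illustration: the helper `select_points` is hypothetical, and the unspecified order in which GPU threads reach a voxel is approximated with a shuffle rather than real parallel execution.

```python
import random

MAX_POINTS_PER_VOXEL = 32  # per-voxel cap mentioned in the note above


def select_points(voxel_points, arrival_order=None):
    """Keep at most 32 points of one voxel, in the order they arrive."""
    if arrival_order is None:
        arrival_order = range(len(voxel_points))  # CPU-like: input order
    kept = [voxel_points[i] for i in arrival_order]
    return kept[:MAX_POINTS_PER_VOXEL]


points = list(range(100))  # stand-in for 100 points that fall into one voxel

# CPU-like selection: always the first 32 points, identical on every run.
cpu_like = select_points(points)

# GPU-like selection: arrival order is unspecified, modeled here as a shuffle.
order = list(range(len(points)))
random.Random(0).shuffle(order)
gpu_like = select_points(points, order)
```

Both selections keep 32 points, but only the CPU-like one is guaranteed to contain the same points each time, which is why the two generateVoxels outputs cannot be compared element by element.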

Owner

NVIDIA AI IOT
