MMDet to TensorRT

This project aims to convert the mmdetection model to TensorRT model end2end. Focus on object detection for now. Mask support is experiment.

support:

fp16
int8(experiment)
batched input
dynamic input shape
combination of different modules
deepstream support

Any advices, bug reports and stars are welcome.

License

This project is released under the Apache 2.0 license.

Requirement

install mmdetection:

# mim is so cool!
pip install openmim
mim install mmdet==2.14.0

install torch2trt_dynamic:

git clone https://github.com/grimoire/torch2trt_dynamic.git torch2trt_dynamic
cd torch2trt_dynamic
python setup.py develop

install amirstan_plugin:

Install tensorrt: TensorRT

clone repo and build plugin

git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git
cd amirstan_plugin
git submodule update --init --progress --depth=1
mkdir build
cd build
cmake -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j10

DON'T FORGET setting the envoirment variable(in ~/.bashrc):

export AMIRSTAN_LIBRARY_PATH=${amirstan_plugin_root}/build/lib

Installation

Host

git clone https://github.com/grimoire/mmdetection-to-tensorrt.git
cd mmdetection-to-tensorrt
python setup.py develop

Docker

Build docker image

# cuda11.1 TensorRT7.2.2 pytorch1.8 cuda11.1
sudo docker build -t mmdet2trt_docker:v1.0 docker/

You can also specify CUDA, Pytorch and Torchvision versions with docker build args by:

# cuda11.1 tensorrt7.2.2 pytorch1.6 cuda10.2
sudo docker build -t mmdet2trt_docker:v1.0 --build-arg TORCH_VERSION=1.6.0 --build-arg TORCHVISION_VERSION=0.7.0 --build-arg CUDA=10.2 --docker/

Run (will show the help for the CLI entrypoint)

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0

Or if you want to open a terminal inside de container:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} --entrypoint bash mmdet2trt_docker:v1.0

Example conversion:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0 ${bind_path}/config.py ${bind_path}/checkpoint.pth ${bind_path}/output.trt

Usage

how to create a TensorRT model from mmdet model (converting might take few minutes)(Might have some warning when converting.) detail can be found in getting_started.md

CLI

mmdet2trt ${CONFIG_PATH} ${CHECKPOINT_PATH} ${OUTPUT_PATH}

Run mmdet2trt -h for help on optional arguments.

Python

opt_shape_param=[
    [
        [1,3,320,320],      # min shape
        [1,3,800,1344],     # optimize shape
        [1,3,1344,1344],    # max shape
    ]
]
max_workspace_size=1<<30    # some module and tactic need large workspace.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)

# save converted model
torch.save(trt_model.state_dict(), save_model_path)

# save engine if you want to use it in c++ api
with open(save_engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])

Note:

The input of the engine is the tensor after preprocess.
The output of the engine is num_dets, bboxes, scores, class_ids. if you enable the enable_mask flag, there will be another output mask.
The bboxes output of the engine did not divided by scale factor.

how to use the converted model

from mmdet.apis import inference_detector
from mmdet2trt.apis import create_wrap_detector

# create wrap detector
trt_detector = create_wrap_detector(trt_model, cfg_path, device_id)

# result share same format as mmdetection
result = inference_detector(trt_detector, image_path)

# visualize
trt_detector.show_result(
    image_path,
    result,
    score_thr=score_thr,
    win_name='mmdet2trt',
    show=True)

Try demo in demo/inference.py, or demo/cpp if you want to do inference with c++ api.

Read getting_started.md for more details.

How does it works?

Most other project use pytorch=>ONNX=>tensorRT route, This repo convert pytorch=>tensorRT directly, avoid unnecessary ONNX IR. Read how-does-it-work for detail.

Support Model/Module

Tested on:

torch=1.8.1
tensorrt=8.0.1.6
mmdetection=2.18.0
cuda=11.1

If you find any error, please report it in the issue.

FAQ

read this page if you meet any problem.

Contact

This repo is maintained by @grimoire

Discuss group: QQ:1107959378

And send your resume to my e-mail if you want to join @OpenMMLab. Please read the JD for detail: link

Convert openmmlab (not only mmdetection) series model to tensorrt

Related tags

Overview

MMDet to TensorRT

License

Requirement

Installation

Host

Docker

Usage

CLI

Python

How does it works?

Support Model/Module

FAQ

Contact

Owner

JinTian

Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"

Using deep learning to predict gene structures of the coding genes in DNA sequences of Arabidopsis thaliana

《Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching》(CVPR 2020)

Image Restoration Using Swin Transformer for VapourSynth

Repository for the electrical and ICT benchmark model developed in the ERIGrid 2.0 project.

Temporal Knowledge Graph Reasoning Triggered by Memories

PyTorch Implementation of PIXOR: Real-time 3D Object Detection from Point Clouds

Neural Oblivious Decision Ensembles

The Habitat-Matterport 3D Research Dataset - the largest-ever dataset of 3D indoor spaces.

Open source code for the paper of Neural Sparse Voxel Fields.

Jittor implementation of PCT:Point Cloud Transformer

Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

Official PyTorch Implementation of Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity

pytorch implementation of dftd2 & dftd3

Embeds a story into a music playlist by sorting the playlist so that the order of the music follows a narrative arc.

This repository contains the official code of the paper Equivariant Subgraph Aggregation Networks (ICLR 2022)

Deep learning library featuring a higher-level API for TensorFlow.

Deep Learning Based Fasion Recommendation System for Ecommerce