Convert OpenMMLab (not only mmdetection) series models to TensorRT


MMDet to TensorRT

This project aims to convert mmdetection models to TensorRT models end to end. It focuses on object detection for now; mask support is experimental.

Supported features:

  • fp16
  • int8 (experimental)
  • batched input
  • dynamic input shape
  • combination of different modules
  • deepstream support

Advice, bug reports, and stars are all welcome.

License

This project is released under the Apache 2.0 license.

Requirement

  • install mmdetection:

    # mim is so cool!
    pip install openmim
    mim install mmdet==2.14.0
  • install torch2trt_dynamic:

    git clone https://github.com/grimoire/torch2trt_dynamic.git torch2trt_dynamic
    cd torch2trt_dynamic
    python setup.py develop
  • install amirstan_plugin:

    • Install tensorrt: TensorRT

    • clone repo and build plugin

      git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git
      cd amirstan_plugin
      git submodule update --init --progress --depth=1
      mkdir build
      cd build
      cmake -DTENSORRT_DIR=${TENSORRT_DIR} ..
      make -j10
    • DON'T FORGET to set the environment variable (in ~/.bashrc):

      export AMIRSTAN_LIBRARY_PATH=${amirstan_plugin_root}/build/lib
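
A missing or mistyped AMIRSTAN_LIBRARY_PATH is easy to overlook, so it can help to check it before converting. The following is a minimal sketch, not part of this project: the helper name check_plugin_path is hypothetical, and it only verifies that the variable is set and points at an existing directory, not that the plugin library itself loads.

```python
import os
from pathlib import Path


def check_plugin_path(env=None):
    """Return the plugin directory named by AMIRSTAN_LIBRARY_PATH.

    `env` defaults to os.environ; passing a dict makes the check testable.
    Hypothetical helper: it does not try to load the plugin library.
    """
    env = os.environ if env is None else env
    value = env.get("AMIRSTAN_LIBRARY_PATH")
    if not value:
        raise RuntimeError(
            "AMIRSTAN_LIBRARY_PATH is not set; export it in ~/.bashrc")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(
            f"AMIRSTAN_LIBRARY_PATH={value!r} is not an existing directory")
    return path
```

Calling check_plugin_path() at the top of a conversion script turns a forgotten export into an early, readable error.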

Installation

Host

git clone https://github.com/grimoire/mmdetection-to-tensorrt.git
cd mmdetection-to-tensorrt
python setup.py develop

Docker

Build docker image

# cuda11.1 TensorRT7.2.2 pytorch1.8
sudo docker build -t mmdet2trt_docker:v1.0 docker/

You can also specify the CUDA, PyTorch and Torchvision versions with docker build args:

# cuda10.2 tensorrt7.2.2 pytorch1.6
sudo docker build -t mmdet2trt_docker:v1.0 --build-arg TORCH_VERSION=1.6.0 --build-arg TORCHVISION_VERSION=0.7.0 --build-arg CUDA=10.2 docker/

Run (will show the help for the CLI entrypoint)

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0

Or, if you want to open a terminal inside the container:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} --entrypoint bash mmdet2trt_docker:v1.0

Example conversion:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0 ${bind_path}/config.py ${bind_path}/checkpoint.pth ${bind_path}/output.trt

Usage

How to create a TensorRT model from an mmdet model. Converting might take a few minutes and might emit some warnings. Details can be found in getting_started.md.

CLI

mmdet2trt ${CONFIG_PATH} ${CHECKPOINT_PATH} ${OUTPUT_PATH}

Run mmdet2trt -h for help on optional arguments.

Python

import torch
from mmdet2trt import mmdet2trt

opt_shape_param = [
    [
        [1, 3, 320, 320],      # min shape
        [1, 3, 800, 1344],     # optimize shape
        [1, 3, 1344, 1344],    # max shape
    ]
]
max_workspace_size = 1 << 30    # some modules and tactics need a large workspace.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)

# save the converted model
torch.save(trt_model.state_dict(), save_model_path)

# save the engine if you want to use it with the C++ API
with open(save_engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
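
The opt_shape_param above holds one [min, optimize, max] triple per input tensor. TensorRT optimization profiles require min <= optimize <= max in every dimension, so a quick check before conversion can catch a typo early. This is a minimal sketch; check_opt_shape_param is a hypothetical helper, not part of mmdet2trt.

```python
def check_opt_shape_param(opt_shape_param):
    """Validate a list of [min, optimize, max] shape triples.

    Hypothetical helper: raises ValueError on the first problem found
    and returns the parameter unchanged when everything is consistent.
    """
    for index, triple in enumerate(opt_shape_param):
        if len(triple) != 3:
            raise ValueError(
                f"input {index}: expected [min, optimize, max], "
                f"got {len(triple)} shapes")
        min_shape, opt_shape, max_shape = triple
        if not (len(min_shape) == len(opt_shape) == len(max_shape)):
            raise ValueError(f"input {index}: shapes differ in rank")
        # Compare the three shapes dimension by dimension.
        for dim, (low, mid, high) in enumerate(
                zip(min_shape, opt_shape, max_shape)):
            if not (low <= mid <= high):
                raise ValueError(
                    f"input {index}, dim {dim}: need min <= optimize <= max, "
                    f"got {low}, {mid}, {high}")
    return opt_shape_param
```

For example, check_opt_shape_param(opt_shape_param) passes for the shapes shown above and raises if the min and max rows are swapped.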

Note:

  • The input of the engine is the tensor after preprocessing.
  • The outputs of the engine are num_dets, bboxes, scores, class_ids. If you enable the enable_mask flag, there will be another output, mask.
  • The bboxes output of the engine is not divided by the scale factor.
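
Because the bboxes are not divided by the scale factor, code that consumes the raw engine outputs has to map the boxes back to the original image itself. The sketch below shows one way to do that with NumPy for a single image. It rests on assumptions the notes above do not confirm: bboxes has shape (N, 4) as x1, y1, x2, y2, scores and class_ids have shape (N,), the first num_dets entries are the valid ones, and scale_factor is either one number or four values (w, h, w, h) as in mmdetection. The function name rescale_detections is hypothetical.

```python
import numpy as np


def rescale_detections(num_dets, bboxes, scores, class_ids, scale_factor):
    """Keep the valid detections of one image and undo the resize scale.

    Assumed layout (not confirmed by the README): bboxes is (N, 4),
    scores and class_ids are (N,), and the first num_dets rows are valid.
    """
    count = int(np.asarray(num_dets).reshape(-1)[0])
    scale = np.asarray(scale_factor, dtype=np.float32).reshape(-1)
    if scale.size == 1:
        scale = np.repeat(scale, 4)      # same scale for x and y
    elif scale.size != 4:
        raise ValueError("scale_factor must have 1 or 4 values")
    # Dividing by the scale maps boxes from the resized input
    # back to the coordinates of the original image.
    boxes = np.asarray(bboxes, dtype=np.float32)[:count] / scale
    return boxes, np.asarray(scores)[:count], np.asarray(class_ids)[:count]
```

The create_wrap_detector path described below returns results in the mmdetection format, so a helper like this is only relevant when reading the engine outputs directly, for example from the C++ API or a custom runtime.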

How to use the converted model

from mmdet.apis import inference_detector
from mmdet2trt.apis import create_wrap_detector

# create wrap detector
trt_detector = create_wrap_detector(trt_model, cfg_path, device_id)

# the result shares the same format as mmdetection
result = inference_detector(trt_detector, image_path)

# visualize
trt_detector.show_result(
    image_path,
    result,
    score_thr=score_thr,
    win_name='mmdet2trt',
    show=True)

Try the demo in demo/inference.py, or demo/cpp if you want to run inference with the C++ API.

Read getting_started.md for more details.

How does it work?

Most other projects use the pytorch=>ONNX=>tensorRT route. This repo converts pytorch=>tensorRT directly, avoiding the unnecessary ONNX IR. Read how-does-it-work for details.

Support Model/Module

  • Faster R-CNN
  • Cascade R-CNN
  • Double-Head R-CNN
  • Group Normalization
  • Weight Standardization
  • DCN
  • SSD
  • RetinaNet
  • Libra R-CNN
  • FCOS
  • Fovea
  • CARAFE
  • FreeAnchor
  • RepPoints
  • NAS-FPN
  • ATSS
  • PAFPN
  • FSAF
  • GCNet
  • Guided Anchoring
  • Generalized Attention
  • Dynamic R-CNN
  • Hybrid Task Cascade
  • DetectoRS
  • Side-Aware Boundary Localization
  • YOLOv3
  • PAA
  • CornerNet(WIP)
  • Generalized Focal Loss
  • Grid RCNN
  • VFNet
  • GROIE
  • Mask R-CNN (experimental)
  • Cascade Mask R-CNN (experimental)
  • Cascade RPN
  • DETR
  • YOLOX

Tested on:

  • torch=1.8.1
  • tensorrt=8.0.1.6
  • mmdetection=2.18.0
  • cuda=11.1

If you find any errors, please report them in the issues.

FAQ

Read this page if you run into any problems.

Contact

This repo is maintained by @grimoire

Discuss group: QQ:1107959378

Send your resume to my e-mail if you want to join @OpenMMLab. Please read the JD for details: link
