This package simplifies exporting PyTorch models to ONNX and TensorRT and provides base interfaces for model inference.

Overview

PyTorch Infer Utils

To install

git clone https://github.com/gorodnitskiy/pytorch_infer_utils.git
pip install /path/to/pytorch_infer_utils/

Export PyTorch model to ONNX

  • Denormal weights can slow down inference, so check the model for them. Use the load_weights_rounded_model function to load the model with rounded weights:
    from pytorch_infer_utils import load_weights_rounded_model
    
    model = ModelClass()
    load_weights_rounded_model(
        model,
        "/path/to/model_state_dict",
        map_location=map_location
    )
    
  • Use the ONNXExporter.torch2onnx method to export a PyTorch model to ONNX:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, 224, 224]  # -1 marks a dynamic dimension
    exporter.torch2onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Use ONNXExporter.optimize_onnx method to optimize ONNX via onnxoptimizer:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • Use the ONNXExporter.optimize_onnx_sim method to optimize ONNX via onnx-simplifier. Take care that onnx-simplifier does not drop dynamic shapes:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx_sim("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • The ONNXExporter.torch2optimized_onnx method combines the steps above:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, -1, -1]  # -1 marks a dynamic dimension
    exporter.torch2optimized_onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Other params that can be passed at class initialization:
    • default_shapes: shapes substituted for dynamic dimensions, default = [1, 3, 224, 224]
    • onnx_export_params:
      • export_params: store the trained parameter weights inside the model file, default = True
      • do_constant_folding: whether to execute constant folding for optimization, default = True
      • input_names: the model's input names, default = ["input"]
      • output_names: the model's output names, default = ["output"]
      • opset_version: the ONNX opset version to export the model to, default = 11
    • onnx_optimize_params:
      • fixed_point: use fixed point, default = False
      • passes: optimization passes, default = [ "eliminate_deadend", "eliminate_duplicate_initializer", "eliminate_identity", "eliminate_if_with_const_cond", "eliminate_nop_cast", "eliminate_nop_dropout", "eliminate_nop_flatten", "eliminate_nop_monotone_argmax", "eliminate_nop_pad", "eliminate_nop_transpose", "eliminate_unused_initializer", "extract_constant_to_initializer", "fuse_add_bias_into_conv", "fuse_bn_into_conv", "fuse_consecutive_concats", "fuse_consecutive_log_softmax", "fuse_consecutive_reduce_unsqueeze", "fuse_consecutive_squeezes", "fuse_consecutive_transposes", "fuse_matmul_add_bias_into_gemm", "fuse_pad_into_conv", "fuse_transpose_into_gemm", "lift_lexical_references", "nop" ]
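
To make the relationship between input_shapes and default_shapes concrete, here is a minimal sketch of how -1 dimensions could be resolved before export. It is not the package's actual implementation: the helper names resolve_shapes and dynamic_axes_for and the axis labels dim_0, dim_1, ... are hypothetical. The second helper produces a mapping in the format accepted by the dynamic_axes argument of torch.onnx.export.

```python
from typing import Dict, List, Sequence


def resolve_shapes(
    input_shapes: Sequence[int],
    default_shapes: Sequence[int] = (1, 3, 224, 224),
) -> List[int]:
    """Replace every dynamic (-1) dimension with its default value.

    Hypothetical helper: the concrete shape is what a dummy input
    tensor for tracing would need.
    """
    if len(input_shapes) != len(default_shapes):
        raise ValueError("input_shapes and default_shapes must have equal rank")
    return [
        default if dim == -1 else dim
        for dim, default in zip(input_shapes, default_shapes)
    ]


def dynamic_axes_for(
    input_shapes: Sequence[int],
    input_name: str = "input",
) -> Dict[str, Dict[int, str]]:
    """Map each -1 dimension to an axis label (labels are made up here)."""
    axes = {i: f"dim_{i}" for i, dim in enumerate(input_shapes) if dim == -1}
    return {input_name: axes} if axes else {}


shapes = [-1, 3, -1, -1]
dummy_shape = resolve_shapes(shapes)      # concrete shape for a dummy input
dynamic_axes = dynamic_axes_for(shapes)   # axes to mark as dynamic on export
```

With a fully static input_shapes, dynamic_axes_for returns an empty mapping, so the exported graph would keep fixed dimensions.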

Export ONNX to TensorRT

  • Check that TensorRT works via the check_tensorrt_health function.
  • Use TRTEngineBuilder.build_engine method to export ONNX to TensorRT:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    # return the built engine
    engine = exporter.build_engine("/path/to/model.onnx")
    # or also save the engine to /path/to/model.trt
    exporter.build_engine("/path/to/model.onnx", engine_path="/path/to/model.trt")
    
  • fp16_mode is available:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine("/path/to/model.onnx", fp16_mode=True)
    
  • int8_mode is available. It requires three extra arguments: calibration_set, a set of calibration images as List[Any]; load_image_func, a function that reads and preprocesses an image; and max_image_shape, the maximum image size as [C, H, W], used to allocate enough memory:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine(
        "/path/to/model.onnx",
        int8_mode=True,
        calibration_set=calibration_set,
        max_image_shape=max_image_shape,
        load_image_func=load_image_func,
    )
    
  • Additional params for the builder config (builder.create_builder_config) can be passed as kwargs.
  • Other params that can be passed at class initialization:
    • opt_shape_dict: optimal shapes, default = {'input': [[1, 3, 224, 224], [1, 3, 224, 224], [1, 3, 224, 224]]}
    • max_workspace_size: max workspace size, default = [1, 30]
    • stream_batch_size: batch size for forward passes during int8 calibration, default = 100
    • cache_file: int8_mode cache filename, default = "model.trt.int8calibration"
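
The default opt_shape_dict holds three shapes per input. Assuming they follow the TensorRT optimization-profile convention of [min, opt, max] (an assumption; the README does not say so explicitly), a custom dictionary can be sanity-checked before a build. The helper name validate_opt_shapes below is hypothetical and not part of the package.

```python
from typing import Dict, Sequence


def validate_opt_shapes(
    opt_shape_dict: Dict[str, Sequence[Sequence[int]]],
) -> None:
    """Raise ValueError unless every entry is [min, opt, max] with
    min <= opt <= max along each axis (assumed convention)."""
    for name, shapes in opt_shape_dict.items():
        if len(shapes) != 3:
            raise ValueError(
                f"{name}: expected [min, opt, max], got {len(shapes)} shapes"
            )
        min_shape, opt_shape, max_shape = shapes
        if not (len(min_shape) == len(opt_shape) == len(max_shape)):
            raise ValueError(f"{name}: shapes must all have the same rank")
        for axis, (lo, opt, hi) in enumerate(
            zip(min_shape, opt_shape, max_shape)
        ):
            if not (lo <= opt <= hi):
                raise ValueError(
                    f"{name}: axis {axis} violates min <= opt <= max "
                    f"({lo}, {opt}, {hi})"
                )


# Example: batch size may vary from 1 to 32, tuned for 8.
opt_shape_dict = {
    "input": [[1, 3, 224, 224], [8, 3, 224, 224], [32, 3, 224, 224]],
}
validate_opt_shapes(opt_shape_dict)  # passes silently
```

Checking the dictionary up front gives a readable error instead of a failure deep inside the engine build.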

Inference via onnxruntime on CPU and onnx_tensorrt on GPU

  • The __init__ of the base class ONNXWrapper has the following structure:
    def __init__(
        self,
        onnx_path: str,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
    ) -> None:
        """
        :param onnx_path: onnx-file path, required
        :param gpu_device_id: gpu device id to use; None runs on CPU via
            onnxruntime, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads,
            set to 0 to let onnxruntime choose, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads,
            set to 0 to let onnxruntime choose, default = 0
        :type onnx_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            import onnx
            import onnx_tensorrt.backend as backend
    
            self.is_using_tensorrt = True
            model_proto = onnx.load(onnx_path)
            for gr_input in model_proto.graph.input:
                gr_input.type.tensor_type.shape.dim[0].dim_value = 1
    
            self.engine = backend.prepare(
                model_proto, device=f"CUDA:{gpu_device_id}"
            )
    
  • The ONNXWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.engine.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
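
The run structure above is a template method: a subclass supplies _process_img_ and _process_preds_, and the base class handles the backend call. The sketch below illustrates the pattern with a stand-in base class so it runs without onnxruntime or TensorRT. StubWrapper, its fake _forward_ method, and BinaryClassifier are all hypothetical; with the real package you would presumably subclass ONNXWrapper instead.

```python
import math
from typing import Any, List, Sequence, Tuple


class StubWrapper:
    """Hypothetical stand-in for ONNXWrapper; the backend call is faked."""

    def _process_img_(self, img: Any) -> Any:
        raise NotImplementedError

    def _process_preds_(self, preds: Any) -> Any:
        raise NotImplementedError

    def _forward_(self, img: Sequence[float]) -> List[List[float]]:
        # Placeholder for engine.run(...) / ort_session.run(...):
        # returns a single output holding two fake logits.
        total = sum(img)
        return [[total, -total]]

    def run(self, img: Any) -> Any:
        img = self._process_img_(img)
        preds = self._forward_(img)
        return self._process_preds_(preds)


class BinaryClassifier(StubWrapper):
    def _process_img_(self, img: Sequence[int]) -> List[float]:
        # Scale 8-bit pixel values to the [0, 1] range.
        return [pixel / 255.0 for pixel in img]

    def _process_preds_(
        self, preds: List[List[float]]
    ) -> Tuple[int, List[float]]:
        # Numerically stable softmax over the first output, then argmax.
        logits = preds[0]
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        norm = sum(exps)
        probs = [value / norm for value in exps]
        return probs.index(max(probs)), probs


label, probs = BinaryClassifier().run([255, 0, 255])
```

Keeping pre- and post-processing in the subclass means the same model wrapper code can serve both the CPU and the GPU backend.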
    

Inference via onnxruntime on CPU and TensorRT on GPU

  • The __init__ of the base class TRTWrapper has the following structure:
    def __init__(
        self,
        onnx_path: Optional[str] = None,
        trt_path: Optional[str] = None,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
        fp16_mode: bool = False,
    ) -> None:
        """
        :param onnx_path: onnx-file path, default = None
        :param trt_path: TensorRT engine file path, default = None
        :param gpu_device_id: gpu device id to use; None runs on CPU via
            onnxruntime, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads,
            set to 0 to let onnxruntime choose, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads,
            set to 0 to let onnxruntime choose, default = 0
        :param fp16_mode: use fp16_mode if class initializes only with
            onnx_path on GPU, default = False
        :type onnx_path: str
        :type trt_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        :type fp16_mode: bool
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            self.is_using_tensorrt = True
            if trt_path is None:
                builder = TRTEngineBuilder()
                trt_path = builder.build_engine(onnx_path, fp16_mode=fp16_mode)
    
            self.trt_session = TRTRunWrapper(trt_path)
    
  • The TRTWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.trt_session.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
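
The branching in TRTWrapper.__init__ can be summarized as a small decision function. This is a sketch for illustration only: select_backend and the returned labels are hypothetical, and the explicit ValueError checks are an addition that the constructor shown above does not perform.

```python
from typing import Optional


def select_backend(
    onnx_path: Optional[str] = None,
    trt_path: Optional[str] = None,
    gpu_device_id: Optional[int] = None,
) -> str:
    """Mirror the constructor's branching as a plain decision function."""
    if gpu_device_id is None:
        # CPU path: onnxruntime loads the ONNX file directly.
        if onnx_path is None:
            raise ValueError("CPU inference needs onnx_path")
        return "onnxruntime"
    if trt_path is not None:
        # GPU path with a prebuilt TensorRT engine file.
        return "tensorrt:prebuilt-engine"
    if onnx_path is None:
        raise ValueError("GPU inference needs trt_path or onnx_path")
    # GPU path without an engine: one is built from the ONNX file first.
    return "tensorrt:build-from-onnx"


cpu_backend = select_backend(onnx_path="/path/to/model.onnx")
gpu_backend = select_backend(trt_path="/path/to/model.trt", gpu_device_id=0)
```

Note that fp16_mode only matters in the last branch, where the engine is built from onnx_path on the GPU.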
    

Environment

TensorRT

  • See the official TensorRT installation guide.
  • Requires CUDA Runtime and CUDA Toolkit.
  • Also requires additional Python packages that are not listed in setup.cfg (the versions depend on the CUDA environment):
    • pycuda
    • nvidia-tensorrt
    • nvidia-pyindex

onnx_tensorrt

  • onnx_tensorrt requires cuda-runtime and tensorrt.
  • To install:
    git clone --depth 1 --branch 21.02 https://github.com/onnx/onnx-tensorrt.git
    cd onnx-tensorrt
    cp -r onnx_tensorrt /usr/local/lib/python3.8/dist-packages
    cd ..
    rm -rf onnx-tensorrt
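
Because these GPU dependencies are installed separately, it can help to check which of them are importable before building an engine. The sketch below uses only the standard library. The helper name find_missing_modules is hypothetical, and the import names pycuda, tensorrt, and onnx_tensorrt are the usual module names for the packages above. It does not replace the package's own check_tensorrt_health.

```python
import importlib.util
from typing import Iterable, List


def find_missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Optional GPU dependencies referenced in this README.
missing = find_missing_modules(["pycuda", "tensorrt", "onnx_tensorrt"])
if missing:
    print("Missing optional GPU packages:", ", ".join(missing))
```

find_spec only locates a module without importing it, so the check stays cheap and does not initialize CUDA.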
    
Owner
Alex Gorodnitskiy
Computer Vision Engineer 🤖