Here I explain the flow for deploying your custom deep learning models on the Ultra96V2.

Overview

Xilinx_Vitis_AI

This repo will help you deploy your deep learning model on the Ultra96V2 board.


Prerequisites

  1. Vitis Core Development Kit 2019.2

This can be downloaded from here: Link to the website

  2. Vitis-AI GitHub Repository v1.1

Here is the link to the repository v1.1

  3. Vitis-AI Docker Container

The command to pull the container: docker pull xilinx/vitis-ai:1.1.56

  4. XRT 2019.2

GitHub Repo Link 2019.2

  5. Avnet Vitis Platform 2019.2

Here is the link to download the zip file: Avnet Website

  6. Ubuntu OS 18.04
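
Before going further, it can help to confirm that the command-line tools used later in this guide are on your PATH. The following is only a minimal sketch: the helper name check_tools is hypothetical, and it checks for nothing beyond the commands that appear in this guide (it does not verify versions).

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Host tools invoked later in this guide.
if ! check_tools git wget tar docker; then
    echo "Install the missing tools before continuing."
fi
```

The function returns a non-zero status if any tool is missing, so it can also be used as a guard at the top of your own setup script.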

Once the tools have been set up, there are five (5) main steps to targeting an AI application to the Ultra96V2 platform:

  1. Build the Hardware Design
  2. Compile Your Custom Model
  3. Build the AI Applications
  4. Create the SD Card Content
  5. Execute the AI Applications on hardware

This guide assumes that you have already trained your model in one of the following frameworks: TensorFlow (.pb), Caffe (.caffemodel and .prototxt), or Darknet (.weights and .cfg).

Build the Hardware Design

Clone Xilinx’s Vitis-AI GitHub repository:

$ git clone --branch v1.1 https://github.com/Xilinx/Vitis-AI
$ cd Vitis-AI
$ export VITIS_AI_HOME="$PWD"

Install the Avnet Vitis platform:

Download the platform and extract it to the hard drive of your Linux machine. Then specify the location of the Vitis platform by creating the SDX_PLATFORM environment variable, which points to the location of the .xpfm file.

$ export SDX_PLATFORM=/home/Avnet/vitis/platform_repo/ULTRA96V2/ULTRA96V2.xpfm
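
A wrong SDX_PLATFORM path tends to show up only much later in the build, so a quick check right after the export can save time. This is a sketch under my own assumptions: check_platform is a hypothetical helper, and it only tests that the path ends in .xpfm and names an existing file.

```shell
# Hypothetical helper: succeed only if the argument names an existing .xpfm file.
check_platform() {
    local xpfm="$1"
    case "$xpfm" in
        *.xpfm) ;;
        *) echo "not an .xpfm path: $xpfm" >&2; return 1 ;;
    esac
    if [ ! -f "$xpfm" ]; then
        echo "file not found: $xpfm" >&2
        return 1
    fi
    return 0
}

if check_platform "${SDX_PLATFORM:-}"; then
    echo "SDX_PLATFORM looks valid: $SDX_PLATFORM"
fi
```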

Build the Hardware Project (SD Card Image)

I suggest downloading the pre-built image from here.

Compile the Trained Models

Remember that you should have pulled the docker container first.

Caffe Models:

$ cd $VITIS_AI_HOME
$ mkdir project
$ cp PATH/TO/TRAINED/MODELS  $VITIS_AI_HOME/project
$ ./docker_run.sh xilinx/vitis-ai:1.1.56
$ cd project
$ conda activate vitis-ai-caffe
$ vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -calib_iter 5
$ vai_c_caffe -p .PROTOTXT -c .CAFFEMODEL -a ARCH.JSON -o OUTPUT_DIR -n NET_NAME 
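
The .PROTOTXT and .CAFFEMODEL placeholders in the compile command refer to the quantizer's output, not to your original float model. As far as I know, vai_q_caffe writes deploy.prototxt and deploy.caffemodel into a quantize_results directory by default; treat those names as an assumption and check the quantizer log for the actual paths. The sketch below (caffe_compile_cmd is a hypothetical helper) only prints the resulting compile command, it does not run it.

```shell
# Hypothetical helper: print (do not run) the vai_c_caffe command for a quantized model.
# Arguments: quantizer output dir, arch JSON, compiler output dir, network name.
# Assumption: the quantizer produced deploy.prototxt and deploy.caffemodel in that dir.
caffe_compile_cmd() {
    local qdir="$1" arch="$2" outdir="$3" net="$4"
    printf 'vai_c_caffe -p %s/deploy.prototxt -c %s/deploy.caffemodel -a %s -o %s -n %s\n' \
        "$qdir" "$qdir" "$arch" "$outdir" "$net"
}

# The values below are placeholders; substitute your own paths and network name.
caffe_compile_cmd quantize_results ARCH.JSON OUTPUT_DIR NET_NAME
```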

Tensorflow Models:

$ cd $VITIS_AI_HOME
$ mkdir project
$ cp PATH/TO/TRAINED/MODELS  $VITIS_AI_HOME/project
$ ./docker_run.sh xilinx/vitis-ai:1.1.56
$ cd project
$ conda activate vitis-ai-tensorflow
$ vai_q_tensorflow quantize --input_frozen_graph FROZEN_PB --input_nodes xxx --output_nodes yyy --input_shapes zzz --input_fn module.calib_input --calib_iter 5
$ vai_c_tensorflow -f FROZEN_PB -a ARCH.JSON -o OUTPUT_DIR -n NET_NAME 
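
Here too, FROZEN_PB in the compile step should be the quantized model rather than the float graph you passed to the quantizer. My understanding is that vai_q_tensorflow 1.1 writes a deploy_model.pb into quantize_results by default; that file name is an assumption, so confirm it against the quantizer output. A small sketch (find_deploy_pb is a hypothetical helper) that checks for the file before you compile:

```shell
# Hypothetical helper: print the path of the quantized model if it exists.
# Assumption: the quantizer output is named deploy_model.pb.
find_deploy_pb() {
    local qdir="${1:-quantize_results}"
    local pb="$qdir/deploy_model.pb"
    if [ -f "$pb" ]; then
        echo "$pb"
    else
        echo "no deploy_model.pb in $qdir; did quantization finish?" >&2
        return 1
    fi
}

if pb=$(find_deploy_pb quantize_results); then
    echo "Compile with: vai_c_tensorflow -f $pb -a ARCH.JSON -o OUTPUT_DIR -n NET_NAME"
fi
```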

Compile the AI Application Using DNNDK APIs

The DNNDK API is the low-level API used to communicate with the AI engine (DPU). It is the recommended API for users who will be creating their own custom neural networks.

Download and install the SDK for cross-compilation, specifying a unique and meaningful installation destination (knowing that this SDK will be specific to the Vitis-AI 1.1 DNNDK samples):

$ wget -O sdk.sh "https://www.xilinx.com/bin/public/openDownload?filename=sdk.sh"
$ chmod +x sdk.sh
$ ./sdk.sh -d ~/petalinux_sdk_vai_1_1_dnndk 

Set up the environment for cross-compilation:

$ unset LD_LIBRARY_PATH
$ source ~/petalinux_sdk_vai_1_1_dnndk/environment-setup-aarch64-xilinx-linux
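
If the environment script was not sourced in the current shell, the later install.sh and make steps will pick up the host toolchain or an empty sysroot. The sketch below checks that the relevant variables are exported. require_env is a hypothetical helper; SDKTARGETSYSROOT is used later in this guide, while CC and CXX are variables I assume the SDK environment script exports.

```shell
# Hypothetical helper: succeed only if every named variable is exported and non-empty.
require_env() {
    local name status=0
    for name in "$@"; do
        if [ -z "$(printenv "$name")" ]; then
            echo "not set: $name" >&2
            status=1
        fi
    done
    return "$status"
}

# Assumption: the SDK environment script exports CC and CXX along with the sysroot.
if require_env SDKTARGETSYSROOT CC CXX; then
    echo "Cross-compilation environment is active: $CC"
fi
```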

Download and extract the DNNDK runtime examples, then install the additional DNNDK runtime content:

$ wget -O vitis-ai_v1.1_dnndk.tar.gz "https://www.xilinx.com/bin/public/openDownload?filename=vitis-ai_v1.1_dnndk.tar.gz"
$ tar -xvzf vitis-ai_v1.1_dnndk.tar.gz
$ cd vitis-ai_v1.1_dnndk
$ ./install.sh $SDKTARGETSYSROOT

Copy the compiled project:

$ cp -r ../project/ .

Download and extract the additional content (images and video files) for the DNNDK examples:

$ wget -O vitis-ai_v1.1_dnndk_sample_img.tar.gz "https://www.xilinx.com/bin/public/openDownload?filename=vitis-ai_v1.1_dnndk_sample_img.tar.gz"
$ tar -xvzf vitis-ai_v1.1_dnndk_sample_img.tar.gz

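For the custom application (project folder), create a model directory and copy the dpu_*.elf model files you previously built (OUTPUT_DIR is the compiler output directory from the compile step above):

$ cd $VITIS_AI_HOME/project
$ mkdir model_for_ultra96v2
$ cp OUTPUT_DIR/dpu_*.elf model_for_ultra96v2/
$ cp -r model_for_ultra96v2 model
$ make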
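
If make fails at link time, the first thing I would check is whether the model directory really contains the compiled DPU kernels. A minimal sketch of that check follows; has_dpu_elf is a hypothetical helper, and the directory name model is the one created above.

```shell
# Hypothetical helper: succeed if the directory holds at least one dpu_*.elf file.
has_dpu_elf() {
    local dir="$1" f
    for f in "$dir"/dpu_*.elf; do
        if [ -f "$f" ]; then
            return 0
        fi
    done
    echo "no dpu_*.elf found in $dir" >&2
    return 1
}

if ! has_dpu_elf model; then
    echo "Copy the compiled dpu_*.elf files into model/ before running make."
fi
```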

NOTE: You can also edit the build.sh script to add support for new platforms such as the Ultra96V2.

Execute the AI Application on ULTRA96V2

  1. Boot the Ultra96V2 with the pre-built SD card image you downloaded. To learn how to do this, click HERE.
  2. $ cd /run/media/mmcblk0p1
  3. $ cp dpu.xclbin /usr/lib/.
  4. Install the Vitis-AI embedded package:
$ cd runtime/vitis-ai_v1.1_dnndk
$ source ./install.sh
  5. Define the DISPLAY environment variable:
$ export DISPLAY=:0.0
$ xrandr --output DP-1 --mode 640x480
  6. Run the custom application:
$ cd vitis_ai_dnndk_samples
$ ./App
Owner
Amin Mamandipoor
Currently studying for a Master's in Computer Systems Architecture at the University of Tabriz.