Python scripts for performing stereo depth estimation using the HITNET model in TensorFlow Lite.

Last update: Oct 20, 2022

Overview

TFLite-HITNET-Stereo-depth-estimation

Stereo depth estimation on the cones images from the Middlebury dataset (https://vision.middlebury.edu/stereo/data/scenes2003/)

Requirements

OpenCV, imread-from-url, and either tensorflow==2.6.0 or tflite_runtime. In addition, pafy and youtube-dl are required for YouTube video inference.

Installation

pip install -r requirements.txt
pip install pafy youtube-dl

For the TFLite runtime, you can use either the full TensorFlow package (version 2.6.0 or above, for example pip install tensorflow==2.6.0) or the TensorFlow Lite runtime binary (tflite_runtime).
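Since either package can provide the interpreter, a script can try both at start-up. The following is a minimal sketch, not the code used in this repository: the helper name pick_interpreter and its import_module parameter are hypothetical, and it assumes that tflite_runtime exposes Interpreter in tflite_runtime.interpreter while TensorFlow exposes it as tf.lite.Interpreter.

```python
import importlib


def pick_interpreter(import_module=importlib.import_module):
    """Return (module_name, Interpreter class) from the first backend that imports.

    `import_module` is injectable only so the lookup can be exercised
    without either package being installed (hypothetical helper).
    """
    # Prefer the lightweight tflite_runtime package, then fall back to TensorFlow.
    candidates = (
        ("tflite_runtime.interpreter", lambda module: module.Interpreter),
        ("tensorflow", lambda module: module.lite.Interpreter),
    )
    for module_name, get_class in candidates:
        try:
            module = import_module(module_name)
        except ImportError:
            continue  # This backend is not installed, try the next one.
        return module_name, get_class(module)
    raise ImportError("Install either tflite_runtime or tensorflow (2.6.0 or above)")
```

Calling pick_interpreter() with no arguments uses the real import machinery and raises ImportError when neither package is available.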

Known issues

On computers with a GPU, the program can crash silently, without any error message, during inference. As a workaround, os.environ["CUDA_VISIBLE_DEVICES"]="-1" is set at the beginning of the script to force the program to run on the CPU. You can comment out this line for other types of devices.
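The workaround only takes effect if the variable is set before any CUDA-aware library is imported. A minimal sketch of that ordering follows; the force_cpu helper is a hypothetical name used for illustration and is not part of the repository's scripts.

```python
import os


def force_cpu():
    """Hide all CUDA devices from libraries imported after this call."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]


force_cpu()
# Import TensorFlow (or tflite_runtime) only after this point, so that the
# library sees the variable when it initialises its devices.
```

On a device where GPU inference works, removing the force_cpu() call restores the default device selection.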

tflite model

The original models were converted to different formats (including .tflite) by PINTO0309. Download the models from his repository and save them into the models folder.
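Before running the examples, it can help to confirm that the downloaded files ended up where the scripts expect them. The sketch below assumes only that the models live in a folder named models relative to the working directory; the helper name find_tflite_models is hypothetical, and no particular model filename is assumed.

```python
from pathlib import Path


def find_tflite_models(models_dir="models"):
    """Return a sorted list of .tflite files found under `models_dir`."""
    folder = Path(models_dir)
    if not folder.is_dir():
        return []  # Folder missing: nothing has been downloaded yet.
    # Search recursively, since the downloaded archive may contain subfolders.
    return sorted(folder.rglob("*.tflite"))


if __name__ == "__main__":
    models = find_tflite_models()
    if not models:
        print("No .tflite files found in the models folder.")
    for model_path in models:
        print(model_path)
```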

Original TensorFlow model

The pretrained TensorFlow model was taken from the original repository.

Examples

Image inference:

python imageDepthEstimation.py

Video inference:

python videoDepthEstimation.py

DrivingStereo dataset inference:

python drivingStereoTest.py

TensorFlow inference

For performing the inference in TensorFlow, check my other repository HITNET Stereo Depth estimation.

ONNX inference

For performing the inference in ONNX, check my other repository ONNX HITNET Stereo Depth estimation.

Inference video Example Raspberry Pi 4

References:

Hitnet model: https://github.com/google-research/google-research/tree/master/hitnet
PINTO0309's model zoo: https://github.com/PINTO0309/PINTO_model_zoo
PINTO0309's model conversion tool: https://github.com/PINTO0309/openvino2tensorflow
DrivingStereo dataset: https://drivingstereo-dataset.github.io/
Original paper: https://arxiv.org/abs/2007.12140

Owner

Ibai Gorordo
