Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"

Overview

M4Depth

This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in

M4Depth: A motion-based approach for monocular depth estimation on video sequences

Michaël Fonder, Damien Ernst and Marc Van Droogenbroeck

arXiv pdf

1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1

Some samples produced by our method: the first line shows the RGB picture capured by the camera, the second the ground-truth depth map and the last one the results produced by our method.

If you find our work useful in your research please consider citing our paper:

@article{Fonder2021M4Depth,
  title     = {M4Depth: A motion-based approach for monocular depth estimation on video sequences},
  author    = {Michael Fonder and Damien Ernst and Marc Van Droogenbroeck},
  booktitle = {arXiv},
  month     = {May},
  year      = {2021}
}

If you use the Mid-Air dataset in your research, please consider citing the related paper:

@INPROCEEDINGS{Fonder2019MidAir,
  author    = {Michael Fonder and Marc Van Droogenbroeck},
  title     = {Mid-Air: A multi-modal dataset for extremely low altitude drone flights},
  booktitle = {Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)},
  year      = {2019},
  month     = {June}
} 

Dependencies

Assuming a fresh Anaconda distribution, you can install the dependencies with:

conda install tensorflow-gpu=1.15 h5py pyquaternion numpy 

Formatting data

Our code works with tensorflow protobuffer files data for training and testing therefore need to be encoded properly before being passed to the network.

Mid-Air dataset

To reproduce the results of our paper, you can use the Mid-Air dataset for training and testing our network. For this, you will first need to download the required data on your computer. The procedure to get them is the following:

  1. Go on the download page of the Mid-Air dataset
  2. Select the "Left RGB" and "Stereo Disparity" image types
  3. Move to the end of the page and enter your email to get the download links (the volume of selected data should be equal to 316.5Go)
  4. Follow the procedure given at the begining of the download page to download and extract the dataset

Once the dataset is downloaded you can generate the required protobuffer files by running the following script:

python3 midair-protobuf_generation.py --db_path path/to/midair-root --output_dir desired/protobuf-location --write

This script generates trajectory sequences with a length of 8 frames and automatically creates the train and test splits for Mid-Air in separated subdirectories.

Custom data

You can also train or test our newtork on your own data. You can generate your own protobuffer files by repurpusing our midair-protobuf_generation.py script. When creating your own protobuffer files, you should pay attention to two major parameters; All sequences should have the same length and each element of a sequence should come with the following data:

  • "image/color_i" : the binary data of the jpeg picture encoding the color data of the frame
  • "Image/depth_i" : the binary data of the 16-bit png file encoding the stereo disparity map
  • "data/omega_i" : a list of three float32 numbers corresponding to the angular rotation between two consecutive frames
  • "data/trans_i" : a list of three float32 numbers corresponding to the translation between two consecutive frames

The subscript i has to be replaced by the index of the data within the trajectory. Translations and rotations are expressed in the standard camera frame of refence axis system.

Training

You can launch a training or a finetuning (if the log_dir already exists) by exectuting the following command line:

python3 m4depth_pipeline.py --train_datadir=path/to/protobuf/dir --log_dir=path/to/logdir --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=6 --num_batches=200000 -b=3 -g=1 --summary_interval_secs=900 --save_interval_secs=1800

If needed, other options are available for the training phase and are described in pipeline_options.py and in m4depth_options.py files. Please note that the code can run on multiple GPUs to speedup the training.

Testing/Evaluation

You can launch the evaluation of your test samples by exectuting the following command line:

python3 m4depth_pipeline.py --test_datadir=path/to/protobuf/dir --log_dir=path/to/logdir --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=8 --b=3 -g=1

If needed, other options are available for the evaluation phase and are described in pipeline_options.py and in m4depth_options.py files.

Pretrained model

We provide pretrained weights for our model in the "trained_weights" directory. Testing or evaluating a dataset from these weight can be done by executing the following command line:

python3 m4depth_pipeline.py --test_datadir=path/to/protobuf/dir --log_dir=trained_weights/M4Depth-d6 --dataset=midair --arch_depth=6 --db_seq_len=8 --seq_len=8 --b=3 -g=1
Owner
Michaël Fonder
PhD candidate in computer vision and deep learning. Interested in drone flight automation by using an on-board mounted monocular camera.
Michaël Fonder
FewBit — a library for memory efficient training of large neural networks

FewBit FewBit — a library for memory efficient training of large neural networks. Its efficiency originates from storage optimizations applied to back

24 Oct 22, 2022
Spectrum is an AI that uses machine learning to generate Rap song lyrics

Spectrum Spectrum is an AI that uses deep learning to generate rap song lyrics. View Demo Report Bug Request Feature Open In Colab About The Project S

39 Dec 16, 2022
Zero-shot Synthesis with Group-Supervised Learning (ICLR 2021 paper)

GSL - Zero-shot Synthesis with Group-Supervised Learning Figure: Zero-shot synthesis performance of our method with different dataset (iLab-20M, RaFD,

Andy_Ge 62 Dec 21, 2022
The project page of paper: Architecture disentanglement for deep neural networks [ICCV 2021, oral]

This is the project page for the paper: Architecture Disentanglement for Deep Neural Networks, Jie Hu, Liujuan Cao, Tong Tong, Ye Qixiang, ShengChuan

Jie Hu 15 Aug 30, 2022
small collection of functions for neural networks

neurobiba other languages: RU small collection of functions for neural networks. very easy to use! Installation: pip install neurobiba See examples h

4 Aug 23, 2021
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

MEAL-V2 This is the official pytorch implementation of our paper: "MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tric

Zhiqiang Shen 653 Dec 19, 2022
Ranger deep learning optimizer rewrite to use newest components

Ranger21 - integrating the latest deep learning components into a single optimizer Ranger deep learning optimizer rewrite to use newest components Ran

Less Wright 266 Dec 28, 2022
Attention for PyTorch with Linear Memory Footprint

Attention for PyTorch with Linear Memory Footprint Unofficially implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention (+

11 Jan 09, 2022
GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

GemNet: Universal Directional Graph Neural Networks for Molecules Reference implementation in PyTorch of the geometric message passing neural network

Data Analytics and Machine Learning Group 124 Dec 30, 2022
Final project code: Implementing BicycleGAN, for CIS680 FA21 at University of Pennsylvania

680 Final Project: BicycleGAN Haoran Tang Instructions 1. Training To train the network, please run train.py. Change hyper-parameters and folder paths

Haoran Tang 0 Apr 22, 2022
PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages

PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages Abstract NLP applications for code-mixed (CM) or mix-li

Mohsin Ali, Mohammed 1 Nov 12, 2021
Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

Implementation for Iso-Points (CVPR 2021) Official code for paper Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations paper |

Yifan Wang 66 Nov 08, 2022
python library for invisible image watermark (blind image watermark)

invisible-watermark invisible-watermark is a python library and command line tool for creating invisible watermark over image.(aka. blink image waterm

Shield Mountain 572 Jan 07, 2023
Cervix ROI Segmentation Using U-NET

Cervix ROI Segmentation Using U-NET Overview This code illustrate how to segment the ROI in cervical images using U-NET. The ROI here meant to include

Scotty Kwok 35 Sep 14, 2022
PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Background Activation Suppression for Weakly Supervised Object Localization PyTorch implementation of ''Background Activation Suppression for Weakly S

35 Jan 06, 2023
A really easy-to-use and powerful sudoku solver.

SodukuSolver This is a really useful sudoku solver with a Qt gui. USAGE Enter the numbers in and click "RUN"! If you don't want to wait, simply press

Ujhhgtg Teams 11 Jun 02, 2022
3D AffordanceNet is a 3D point cloud benchmark consisting of 23k shapes from 23 semantic object categories, annotated with 56k affordance annotations and covering 18 visual affordance categories.

3D AffordanceNet This repository is the official experiment implementation of 3D AffordanceNet benchmark. 3D AffordanceNet is a 3D point cloud benchma

49 Dec 01, 2022
In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard test set accuracy

PixMix Introduction In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard te

Andy Zou 79 Dec 30, 2022
Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds (Local-Lip)

Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds (Local-Lip) Introduction TL;DR: We propose an efficient and trainabl

17 Dec 01, 2022
Behavioral "black-box" testing for recommender systems

RecList RecList Free software: MIT license Documentation: https://reclist.readthedocs.io. Overview RecList is an open source library providing behavio

Jacopo Tagliabue 375 Dec 30, 2022