Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Last update: Jan 02, 2023

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

This repository contains an implementation of our CVPR2021 publication:

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy. Main pdf, Supplementary pdf, Project Page.

Change log:

Google Colaboratory notebook is now available [June 2021]
Merge net training dataset generation instructions is now available. [June 2021]
bug fix [June 2021]

Setup

We Provided the implementation of our method using MiDas-v2 and SGRnet as the base.

Environments

Our mergenet model is trained using torch 0.4.1 and python 3.6 and is tested with torch<=1.8.

Download our mergenet model weights from here and put it in

.\pix2pix\checkpoints\mergemodel\latest_net_G.pth

To use MiDas-v2 as base: Install dependancies as following:

conda install pytorch torchvision opencv cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install scipy
conda install scikit-image

Download the model weights from MiDas-v2 and put it in

./midas/model.pt

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0

To use SGRnet as base: Install dependancies as following:

conda install pytorch=0.4.1 cuda92 -c pytorch
conda install torchvision
conda install matplotlib
conda install scikit-image
pip install opencv-python

Follow the official SGRnet repository to compile the syncbn module in ./structuredrl/models/syncbn. Download the model weights from SGRnet and put it in

./structuredrl/model.pth.tar

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 1

Different input arguments can be used to generate R0 and R20 results as discussed in the paper.

python run.py --R0 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]
python run.py --R20 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]

Evaluation

Fill in the needed variables in the following matlab file and run:

./evaluation/evaluatedataset.m

estimation_path : path to estimated disparity maps
gt_depth_path : path to gt depth/disparity maps
dataset_disp_gttype : (true) if ground truth data is disparity and (false) if gt depth data is depth.
evaluation_matfile_save_dir : directory to save the evalution results as .mat file.
superpixel_scale : scale parameter to run the superpixels on scaled version of the ground truth images to accelarate the evaluation. use 1 for small gt images.

Training

Navigate to dataset preparation instructions to download and prepare the training dataset.

python ./pix2pix/train.py --dataroot DATASETDIR --name mergemodeltrain --model pix2pix4depth --no_flip --no_dropout

python ./pix2pix/test.py --dataroot DATASETDIR --name mergemodeleval --model pix2pix4depth --no_flip --no_dropout

Citation

This implementation is provided for academic use only. Please cite our paper if you use this code or any of the models.

@INPROCEEDINGS{Miangoleh2021Boosting,
author={S. Mahdi H. Miangoleh and Sebastian Dille and Long Mai and Sylvain Paris and Ya\u{g}{\i}z Aksoy},
title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
journal={Proc. CVPR},
year={2021},
}

Credits

The "Merge model" code skeleton (./pix2pix folder) was adapted from the pytorch-CycleGAN-and-pix2pix repository.

For MiDaS and SGR inferences we used the scripts and models from MiDas-v2 and SGRnet respectively (./midas and ./structuredrl folders).

Thanks to k-washi for providing us with a Google Colaboratory notebook implementation.

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Change log:

Setup

Environments

Evaluation

Training

Citation

Credits

Owner

Computational Photography Lab @ SFU

Pytorch implementation for RelTransformer

Hand Gesture Volume Control | Open CV | Computer Vision

CVPR '21: In the light of feature distributions: Moment matching for Neural Style Transfer

ScaleNet: A Shallow Architecture for Scale Estimation

All the essential resources and template code needed to understand and practice data structures and algorithms in python with few small projects to demonstrate their practical application.

An Open Source Machine Learning Framework for Everyone

A PyTorch implementation of "Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning", IJCAI-21

UCSD Oasis platform

Wind Speed Prediction using LSTMs in PyTorch

Interactive Terraform visualization. State and configuration explorer.

Build fully-functioning computer vision models with PyTorch

TensorFlow (Python API) implementation of Neural Style

Pytorch implementation of YOLOX、PPYOLO、PPYOLOv2、FCOS an so on.

Energy consumption estimation utilities for Jetson-based platforms

Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification

Official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Official pytorch implementation of paper Dual-Level Collaborative Transformer for Image Captioning (AAAI 2021).

[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

LightSeq is a high performance training and inference library for sequence processing and generation implemented in CUDA