This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

Overview

SBEVNet: End-to-End Deep Stereo Layout Estimation

This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

Usage

Dependencies

pip install --upgrade git+https://github.com/divamgupta/pytorch-propane
pip install torch==1.4.0 torchvision==0.5.0
pip install opencv-python
pip install torchgeometry

Dataset and Directories

For the example we use the following directories:

  • Datasets : ./datasets/carla/ and ./datasets/kitti/
  • Weights : ./sbevnet_weights/carla and ./sbevnet_weights/kitti
  • Predictions : ./predictions/kitti ./predictions/carla

Download and unzip the datasets and place them in ./datasets directory

Training

cd <cloned_repo_path>

Training the model on the CARLA dataset:

pytorch_propane sbevnet train    \
 --model_name sbevnet_model --network_name sbevnet --dataset_name  sbevnet_dataset_main --dataset_split train \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test \
 --batch_size 3  --eval_batch_size 1 \
 --n_epochs 20   --overwrite_epochs true  \
 --datapath "datasets/carla/dataset.json" \
 --save_path "sbevnet_weights/carla/carla_save_0" \
 --image_w 512 \
 --image_h 288 \
 --max_disp 64 \
 --n_hmap 100 \
 --xmin 1 \
 --xmax 39 \
 --ymin -19 \
 --ymax 19 \
 --cx 256 \
 --cy 144 \
 --f 179.2531 \
 --tx 0.2 \
 --camera_ext_x 0.9 \
 --camera_ext_y -0.1 \
 --fixed_cam_confs true \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true --check_degenerate true 

Training the model on the KITTI dataset:

pytorch_propane sbevnet train    \
 --model_name sbevnet_model --network_name sbevnet --dataset_name  sbevnet_dataset_main --dataset_split train \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test \
 --batch_size 3  --eval_batch_size 1 \
 --n_epochs 40   --overwrite_epochs true  \
 --datapath "datasets/kitti/dataset.json" \
 --save_path "sbevnet_weights/kitti/kitti_save_0" \
 --image_w 640 \
 --image_h 256 \
 --max_disp 64 \
 --n_hmap 128 \
 --xmin 5.72 \
 --xmax 43.73 \
 --ymin -19 \
 --ymax 19 \
 --camera_ext_x 0 \
 --camera_ext_y 0 \
 --fixed_cam_confs false \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true --check_degenerate true 

Evaluation

Evaluating the model on the CARLA dataset:

pytorch_propane sbevnet eval_iou    \
 --model_name sbevnet_model --network_name sbevnet \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test --dataset_type carla \
 --eval_batch_size 1 \
 --datapath "datasets/carla/dataset.json" \
 --load_checkpoint_path "sbevnet_weights/carla/carla_save_0" \
 --image_w 512 \
 --image_h 288 \
 --max_disp 64 \
 --n_hmap 100 \
 --xmin 1 \
 --xmax 39 \
 --ymin -19 \
 --ymax 19 \
 --cx 256 \
 --cy 144 \
 --f 179.2531 \
 --tx 0.2 \
 --camera_ext_x 0.9 \
 --camera_ext_y -0.1 \
 --fixed_cam_confs true \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true 

Evaluating the model on the KITTI dataset:

pytorch_propane sbevnet eval_iou    \
 --model_name sbevnet_model --network_name sbevnet  \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test --dataset_type kitti \
 --eval_batch_size 1 \
 --datapath "datasets/kitti/dataset.json" \
 --load_checkpoint_path "sbevnet_weights/kitti/kitti_save_0" \
 --image_w 640 \
 --image_h 256 \
 --max_disp 64 \
 --n_hmap 128 \
 --xmin 5.72 \
 --xmax 43.73 \
 --ymin -19 \
 --ymax 19 \
 --camera_ext_x 0 \
 --camera_ext_y 0 \
 --fixed_cam_confs false \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true 

Save Predictions

Save predictions of the model on the CARLA dataset:

pytorch_propane sbevnet save_preds    \
 --model_name sbevnet_model --network_name sbevnet \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test --output_dir "predictions/kitti" \
 --eval_batch_size 1 \
 --datapath "datasets/carla/dataset.json" \
 --load_checkpoint_path "sbevnet_weights/carla/carla_save_0" \
 --image_w 512 \
 --image_h 288 \
 --max_disp 64 \
 --n_hmap 100 \
 --xmin 1 \
 --xmax 39 \
 --ymin -19 \
 --ymax 19 \
 --cx 256 \
 --cy 144 \
 --f 179.2531 \
 --tx 0.2 \
 --camera_ext_x 0.9 \
 --camera_ext_y -0.1 \
 --fixed_cam_confs true \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true 

Save predictions of the model on the KITTI dataset:

pytorch_propane sbevnet save_preds    \
 --model_name sbevnet_model --network_name sbevnet  \
 --eval_dataset_name "sbevnet_dataset_main" --eval_dataset_split test --output_dir "predictions/kitti" \
 --eval_batch_size 1 \
 --datapath "datasets/kitti/dataset.json" \
 --load_checkpoint_path "sbevnet_weights/kitti/kitti_save_0" \
 --image_w 640 \
 --image_h 256 \
 --max_disp 64 \
 --n_hmap 128 \
 --xmin 5.72 \
 --xmax 43.73 \
 --ymin -19 \
 --ymax 19 \
 --camera_ext_x 0 \
 --camera_ext_y 0 \
 --fixed_cam_confs false \
 --do_ipm_rgb true \
 --do_ipm_feats true  \
 --do_mask true 
Owner
Divam Gupta
Graduate student at Carnegie Mellon University | Former Research Fellow at Microsoft Research
Divam Gupta
Codes for 'Dual Parameterization of Sparse Variational Gaussian Processes'

Dual Parameterization of Sparse Variational Gaussian Processes Documentation | Notebooks | API reference Introduction This repository is the official

AaltoML 7 Dec 23, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

Swin Transformer 1.4k Dec 30, 2022
A simple, fully convolutional model for real-time instance segmentation.

You Only Look At CoefficienTs ██╗ ██╗ ██████╗ ██╗ █████╗ ██████╗████████╗ ╚██╗ ██╔╝██╔═══██╗██║ ██╔══██╗██╔════╝╚══██╔══╝ ╚██

Daniel Bolya 4.6k Dec 30, 2022
TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

ICNet_tensorflow This repo provides a TensorFlow-based implementation of paper "ICNet for Real-Time Semantic Segmentation on High-Resolution Images,"

HsuanKung Yang 406 Nov 27, 2022
OntoProtein: Protein Pretraining With Ontology Embedding

OntoProtein This is the implement of the paper "OntoProtein: Protein Pretraining With Ontology Embedding". OntoProtein is an effective method that mak

ZJUNLP 80 Dec 14, 2022
Collection of common code that's shared among different research projects in FAIR computer vision team.

fvcore fvcore is a light-weight core library that provides the most common and essential functionality shared in various computer vision frameworks de

Meta Research 1.5k Jan 07, 2023
Pytorch library for end-to-end transformer models training and serving

Pytorch library for end-to-end transformer models training and serving

Mikhail Grankin 768 Jan 01, 2023
An open source implementation of CLIP.

OpenCLIP Welcome to an open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training). The goal of this repository is to enable

2.7k Dec 31, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
Implementation of "Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner"

Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner This repository is the official implementation of Meta-rPPG: Remote Heart Ra

Eugene Lee 137 Dec 13, 2022
IsoGCN code for ICLR2021

IsoGCN The official implementation of IsoGCN, presented in the ICLR2021 paper Isometric Transformation Invariant and Equivariant Graph Convolutional N

horiem 39 Nov 25, 2022
Open-AI's DALL-E for large scale training in mesh-tensorflow.

DALL-E in Mesh-Tensorflow [WIP] Open-AI's DALL-E in Mesh-Tensorflow. If this is similarly efficient to GPT-Neo, this repo should be able to train mode

EleutherAI 432 Dec 16, 2022
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022
This is an official implementation for "ResT: An Efficient Transformer for Visual Recognition".

ResT By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software Technology at Nanjing University] This repo is the official implement

zhql 222 Dec 13, 2022
Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Christopher T. Chubb 35 Dec 21, 2022
iNAS: Integral NAS for Device-Aware Salient Object Detection

iNAS: Integral NAS for Device-Aware Salient Object Detection Introduction Integral search design (jointly consider backbone/head structures, design/de

顾宇超 77 Dec 02, 2022
This project demonstrates the use of neural networks and computer vision to create a classifier that interprets the Brazilian Sign Language.

LIBRAS-Image-Classifier This project demonstrates the use of neural networks and computer vision to create a classifier that interprets the Brazilian

Aryclenio Xavier Barros 26 Oct 14, 2022
Jax/Flax implementation of Variational-DiffWave.

jax-variational-diffwave Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.) DiffWave with

YoungJoong Kim 37 Dec 16, 2022
Unofficial PyTorch implementation of MobileViT.

MobileViT Overview This is a PyTorch implementation of MobileViT specified in "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Tr

Chin-Hsuan Wu 348 Dec 23, 2022
Official repository of the AAAI'2022 paper "Contrast and Generation Make BART a Good Dialogue Emotion Recognizer"

CoG-BART Contrast and Generation Make BART a Good Dialogue Emotion Recognizer Quick Start: To run the model on test sets of four datasets, Download th

39 Dec 24, 2022