Code for "Learning to Segment Rigid Motions from Two Frames".

Overview

rigidmask

Code for "Learning to Segment Rigid Motions from Two Frames".

** This is a partial release with inference and evaluation code. The project is still being tested and documented. There might be implemention changes in the future release. Thanks for your interest.

Visuals on Sintel/KITTI/Coral (not temporally smoothed):

If you find this work useful, please consider citing:

@article{yang2021rigidmask,
  title={Learning to Segment Rigid Motions from Two Frames},
  author={Yang, Gengshan and Ramanan, Deva},
  journal={arXiv preprint arXiv:2101.03694},
  year={2021}
}

Data and precomputed results

Download

Additional inputs (coral reef images) and precomputed results are hosted on google drive. Run (assuming you have installed gdown)

gdown https://drive.google.com/uc?id=1Up2cPCjzd_HGafw1AB2ijGmiKqaX5KTi -O ./input.tar.gz
gdown https://drive.google.com/uc?id=12C7rl5xS66NpmvtTfikr_2HWL5SakLVY -O ./rigidmask-sf-precomputed.zip
tar -xzvf ./input.tar.gz 
unzip ./rigidmask-sf-precomputed.zip -d precomputed/

To compute the results in Tab.1, Tab.2 on KITTI,

modelname=rigidmask-sf
python eval/eval_seg.py  --path precomputed/$modelname/  --dataset 2015
python eval/eval_sf.py   --path precomputed/$modelname/  --dataset 2015

Install

The code is tested with python 3.8, pytorch 1.7.0, and CUDA 10.2. Install dependencies by

conda env create -f rigidmask.yml
conda activate rigidmask_v0
pip install kornia
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html

Compile DCNv2 and ngransac.

cd models/networks/DCNv2/; python setup.py install; cd -
cd models/ngransac/; python setup.py install; cd -

Pretrained models

Download pre-trained models to ./weights (assuming gdown is installed),

mkdir weights
mkdir weights/rigidmask-sf
mkdir weights/rigidmask-kitti
gdown https://drive.google.com/uc?id=1H2khr5nI4BrcrYMBZVxXjRBQYBcgSOkh -O ./weights/rigidmask-sf/weights.pth
gdown https://drive.google.com/uc?id=1sbu6zVeiiK1Ra1vp_ioyy1GCv_Om_WqY -O ./weights/rigidmask-kitti/weights.pth
modelname training set flow model flow err. (K:Fl-err/EPE) motion-in-depth err. (K:1e4) seg. acc. (K:obj/K:bg/S:bg)
rigidmask-sf (mono) SF C+SF+V 10.9%/3.128px 120.4 90.71%/97.05%/86.72%
rigidmask-kitti (stereo) SF+KITTI C+SF+V->KITTI 4.1%/1.155px 49.7 95.58%/98.91%/-

** C: FlythingChairs, SF(SceneFlow including FlyingThings, Monkaa, and Driving, K: KITTI scene flow training set, V: VIPER, S: Sintel.

Inference

Run and visualize rigid segmentation of coral reef video, (pass --refine to turn on rigid motion refinement). Results will be saved at ./weights/$modelname/seq/ and a output-seg.gif file will be generated in the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-coral --datapath input/imgs/coral/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1
python eval/generate_visual.py --datapath weights/$modelname/seq-coral/ --imgpath input/imgs/coral

Run and visualize two-view depth estimation on kitti video, a output-depth.gif will be saved to the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-kitti --datapath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1.2 --refine
python eval/generate_visual.py --datapath weights/$modelname/seq-kitti/ --imgpath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx
python eval/render_scene.py --inpath weights/rigidmask-sf/seq-kitti/pc0-0000001110.ply

Run and evaluate kitti-sceneflow (monocular setup, Tab. 1 and Tab. 2),

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.2 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py   --path weights/$modelname/  --dataset 2015
modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset sintel_mrflow_val --datapath path-to-sintel-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.5 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset sintel
python eval/eval_sf.py   --path weights/$modelname/  --dataset sintel

Run and evaluate kitti-sceneflow (stereo setup, Tab. 6),

modelname=rigidmask-kitti
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-images   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-train-hsm-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py    --path weights/$modelname/  --dataset 2015

To generate results for kitti-sceneflow benchmark (stereo setup, Tab. 3),

modelname=rigidmask-kitti
mkdir ./benchmark_output
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015test --datapath path-to-kitti-sceneflow-images  --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-test-ganet-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo

Training (todo)

Acknowledge (incomplete)

Causal Imitative Model for Autonomous Driving

Causal Imitative Model for Autonomous Driving Mohammad Reza Samsami, Mohammadhossein Bahari, Saber Salehkaleybar, Alexandre Alahi. arXiv 2021. [Projec

VITA lab at EPFL 8 Oct 04, 2022
Data-depth-inference - Data depth inference with python

Welcome! This readme will guide you through the use of the code in this reposito

Marco 3 Feb 08, 2022
Collection of TensorFlow2 implementations of Generative Adversarial Network varieties presented in research papers.

TensorFlow2-GAN Collection of tf2.0 implementations of Generative Adversarial Network varieties presented in research papers. Model architectures will

41 Apr 28, 2022
TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments for IV 2022.

TorchGRL TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffi

XXQQ 42 Dec 09, 2022
AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models Descrip

Angel de Paula 1 Jun 08, 2022
DANet for Tabular data classification/ regression.

Deep Abstract Networks A PyTorch code implemented for the submission DANets: Deep Abstract Networks for Tabular Data Classification and Regression. Do

Ronnie Rocket 55 Sep 14, 2022
A Tensorflow implementation of CapsNet based on Geoffrey Hinton's paper Dynamic Routing Between Capsules

CapsNet-Tensorflow A Tensorflow implementation of CapsNet based on Geoffrey Hinton's paper Dynamic Routing Between Capsules Notes: The current version

Huadong Liao 3.8k Dec 29, 2022
A simple log parser and summariser for IIS web server logs

IISLogFileParser A basic parser tool for IIS Logs which summarises findings from the log file. Inspired by the Gist https://gist.github.com/wh13371/e7

2 Mar 26, 2022
EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I woud clean up soon. For training the EF

Yaping Zhao 19 Nov 05, 2022
A full pipeline AutoML tool for tabular data

HyperGBM Doc | 中文 We Are Hiring! Dear folks,we are offering challenging opportunities located in Beijing for both professionals and students who are k

DataCanvas 240 Jan 03, 2023
Unofficial pytorch-lightning implement of Mip-NeRF

mipnerf_pl Unofficial pytorch-lightning implement of Mip-NeRF, Here are some results generated by this repository (pre-trained models are provided bel

Jianxin Huang 159 Dec 23, 2022
Churn-Prediction-Project - In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class.

Churn-Prediction-Project In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class. Project in

1 Jan 03, 2022
PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages

PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages Abstract NLP applications for code-mixed (CM) or mix-li

Mohsin Ali, Mohammed 1 Nov 12, 2021
This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX

The goal of Project CodeNet is to provide the AI-for-Code research community with a large scale, diverse, and high quality curated dataset to drive innovation in AI techniques.

International Business Machines 1.2k Jan 04, 2023
This repository contains a Ruby API for utilizing TensorFlow.

tensorflow.rb Description This repository contains a Ruby API for utilizing TensorFlow. Linux CPU Linux GPU PIP Mac OS CPU Not Configured Not Configur

somatic labs 825 Dec 26, 2022
Use graph-based analysis to re-classify stocks and to improve Markowitz portfolio optimization

Dynamic Stock Industrial Classification Use graph-based analysis to re-classify stocks and experiment different re-classification methodologies to imp

Sheng Yang 10 Dec 05, 2022
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li

DamoCV 25 Dec 16, 2022
CLIP+FFT text-to-image

Aphantasia This is a text-to-image tool, part of the artwork of the same name. Based on CLIP model, with FFT parameterizer from Lucent library as a ge

vadim epstein 690 Jan 02, 2023
Official implement of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer

Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer This repository contains the PyTorch code for Evo-ViT. This work proposes a slow-fas

YifanXu 53 Dec 05, 2022
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.

relational-rnn-pytorch An implementation of DeepMind's Relational Recurrent Neural Networks (Santoro et al. 2018) in PyTorch. Relational Memory Core (

Sang-gil Lee 241 Nov 18, 2022