Demo code for paper "Learning optical flow from still images", CVPR 2021.

Depthstillation

[Project page] - [Paper] - [Supplementary]

This code is provided to replicate the qualitative results shown in the supplementary material, Sections 2-4. The code has been tested on Ubuntu 20.04 LTS with Python 3.8 and gcc 9.3.0.

Reference

If you find this code useful, please cite our work:

@inproceedings{Aleotti_CVPR_2021,
  title     = {Learning optical flow from still images},
  author    = {Aleotti, Filippo and
               Poggi, Matteo and
               Mattoccia, Stefano},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}

Contents

  1. Introduction
  2. Usage
  3. Supplementary
  4. Weights
  5. Contacts
  6. Acknowledgments

Introduction

This paper deals with the scarcity of data for training optical flow networks, highlighting the limitations of existing sources such as labeled synthetic datasets or unlabeled real videos. Specifically, we introduce a framework to generate accurate ground-truth optical flow annotations quickly and in large amounts from any readily available single real picture. Given an image, we use an off-the-shelf monocular depth estimation network to build a plausible point cloud for the observed scene. Then, we virtually move the camera in the reconstructed environment with known motion vectors and rotation angles, allowing us to synthesize both a novel view and the corresponding optical flow field connecting each pixel in the input image to the one in the new frame. When trained with our data, state-of-the-art optical flow networks achieve superior generalization to unseen real data compared to the same models trained either on annotated synthetic datasets or unlabeled videos, and better specialization if combined with synthetic images.
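The geometry described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in depthstillation.py: the function name `flow_from_depth` and the convention that `R` and `t` map points from the original camera frame into the virtual camera frame are assumptions made for this example, and the scene is treated as static.

```python
import numpy as np


def flow_from_depth(depth, K, R, t):
    """Flow induced on a static scene by a rigid camera motion (R, t).

    Assumed convention (hypothetical): a 3D point X in the original camera
    frame becomes R @ X + t in the virtual camera frame.
    """
    h, w = depth.shape

    # Pixel grid in homogeneous coordinates, shape (3, h * w).
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])

    # Back-project every pixel to a 3D point using its depth.
    points = (np.linalg.inv(K) @ pix) * depth.ravel()

    # Move the points into the virtual camera frame and project them.
    moved = R @ points + t.reshape(3, 1)
    proj = K @ moved
    u1 = proj[0] / proj[2]
    v1 = proj[1] / proj[2]

    # The flow is the displacement from each source pixel to its new position.
    flow = np.stack([u1 - pix[0], v1 - pix[1]], axis=-1)
    return flow.reshape(h, w, 2)
```

With a constant depth `d`, no rotation, and a translation `tx` along the x axis, every pixel moves horizontally by `fx * tx / d`, which is a quick sanity check for the sketch.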

Usage

Install the project requirements in a new Python 3 environment:

virtualenv -p python3 learning_flow_env
source learning_flow_env/bin/activate
pip install -r requirements.txt

Compile the forward_warping module, written in C (required to handle warping collisions):

cd external/forward_warping
bash compile.sh
cd ../..
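To illustrate what a warping collision is, the sketch below forward-warps an image in pure Python and resolves collisions with a z-buffer: when several source pixels land on the same target pixel, the one closest to the camera wins. It is not the repository's C module; the name `forward_warp`, the nearest-pixel rounding, and the use of a per-pixel depth `z` for the comparison are assumptions made for this example.

```python
import numpy as np


def forward_warp(image, flow, z):
    """Splat each source pixel to its rounded target; the nearest point wins.

    image: (h, w) or (h, w, c) array, flow: (h, w, 2) array of (dx, dy),
    z: (h, w) depth of each source pixel (hypothetical input for this sketch).
    """
    h, w = z.shape
    warped = np.zeros_like(image)
    zbuffer = np.full((h, w), np.inf)
    filled = np.zeros((h, w), dtype=bool)

    for y in range(h):
        for x in range(w):
            x1 = int(round(x + flow[y, x, 0]))
            y1 = int(round(y + flow[y, x, 1]))
            if not (0 <= x1 < w and 0 <= y1 < h):
                continue  # the pixel leaves the frame
            # Collision handling: keep the pixel only if it is closer
            # than whatever has already been written at the target.
            if z[y, x] < zbuffer[y1, x1]:
                zbuffer[y1, x1] = z[y, x]
                warped[y1, x1] = image[y, x]
                filled[y1, x1] = True
    return warped, filled
```

Target pixels where `filled` is False received no source pixel; these are the holes that an inpainting step would have to fill. The double loop is slow in Python, which is one plausible reason to implement this step in C.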

You are now ready to run the depthstillation.py script:

python depthstillation.py 

By switching some parameters, you can generate all the qualitative results provided in the supplementary material.

These parameters are:

  • num_motions: changes the number of virtual motions
  • segment: enables instance segmentation (for independently moving objects)
  • mask_type: mask selection. Options are H' and H
  • num_objects: sets the number of independently moving objects (one, in this example)
  • no_depth: disables monocular depth and forces depth to assume a constant value
  • no_sharp: disables depth sharpening
  • change_k: uses different intrinsics K
  • change_motion: samples a different motion (ignored if num_motions greater than 1)

For instance, to simulate different K settings, just run:

python depthstillation.py --change_k

The results are saved in the dCOCO folder, organized as follows:

  • depth_color: colored depth map
  • flow: generated flow labels (in 16bit KITTI format)
  • flow_color: colored flow labels
  • H: H mask
  • H': H' mask
  • im0: real input image
  • im1: generated virtual image
  • im1_raw: generated virtual image (pre-inpainting)
  • instances_color: colored instance map (if --segment is enabled)
  • M: M mask
  • M': M' mask
  • P: P mask
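For readers unfamiliar with the 16-bit KITTI flow format mentioned in the list above, the sketch below shows the encoding on NumPy arrays. In that format each flow component is stored as `value * 64 + 2^15` in an unsigned 16-bit channel, and a third channel marks valid pixels. The function names are hypothetical, and reading or writing the PNG file itself (which requires an image library with 16-bit support) is left out.

```python
import numpy as np


def encode_kitti_flow(flow, valid=None):
    """Pack an (h, w, 2) float flow field into (h, w, 3) uint16 channels."""
    h, w, _ = flow.shape
    if valid is None:
        valid = np.ones((h, w), dtype=bool)
    encoded = np.zeros((h, w, 3), dtype=np.uint16)
    # Scale by 64 and shift by 2^15 so negative flow fits in uint16.
    scaled = np.clip(np.rint(flow * 64.0 + 2 ** 15), 0, 2 ** 16 - 1)
    encoded[..., :2] = scaled.astype(np.uint16)
    encoded[..., 2] = valid.astype(np.uint16)
    return encoded


def decode_kitti_flow(encoded):
    """Recover the float flow field and the validity mask."""
    flow = (encoded[..., :2].astype(np.float64) - 2 ** 15) / 64.0
    valid = encoded[..., 2] > 0
    return flow, valid
```

The encoding quantizes flow to steps of 1/64 pixel and clips values outside roughly ±512 pixels. When saving with a library that stores channels in BGR order, the channel order has to be reversed so that the horizontal component ends up in the red channel.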

The list of files used to depthstill dCOCO is reported in samples/dCOCO_file_list.txt.

Supplementary

Below are the commands to obtain, in the same order, the figures shown in Sections 2-4 of the Supplementary Material:

  • Section 2 -- the first figure is obtained with default parameters, the following ones with --no_depth and --no_depth --segment, respectively.
  • Section 3 -- the first figure is obtained with --no_sharp, the remaining figures with default parameters or by setting --mask_type "H".
  • Section 4 -- we show the results obtained with default parameters three times, followed respectively by the figures generated using --change_k, --change_motion and --segment individually.

Weights

We provide the RAFT models trained in our experiments. To run them and reproduce our results, please refer to the RAFT repository.

Contacts

m [dot] poggi [at] unibo [dot] it

Acknowledgments

Thanks to Clément Godard and Niantic for sharing monodepth2 code, used to simulate camera motion.

Our work is inspired by Jamie Watson et al., Learning Stereo from Single Images.
