PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Overview

Neural Scene Flow Fields

PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

[Project Website] [Paper] [Video]

Dependency

The code is tested with Python3, Pytorch >= 1.6 and CUDA >= 10.2, the dependencies includes

  • configargparse
  • matplotlib
  • opencv
  • scikit-image
  • scipy
  • cupy
  • imageio.
  • tqdm

Video preprocessing

  1. Download nerf_data.zip from link, an example input video with SfM camera poses and intrinsics estimated from COLMAP (Note you need to use COLMAP "colmap image_undistorter" command to undistort input images to get "dense" folder as shown in the example, this dense folder should include "images" and "sparse" folders).

  2. Download single view depth prediction model "model.pt" from link, and put it on the folder "nsff_scripts".

  3. Run the following commands to generate required inputs for training/inference:

    # Usage
    cd nsff_scripts
    # create camera intrinsics/extrinsic format for NSFF, same as original NeRF where it uses imgs2poses.py script from the LLFF code: https://github.com/Fyusion/LLFF/blob/master/imgs2poses.py
    python save_poses_nerf.py --data_path "/home/xxx/Neural-Scene-Flow-Fields/kid-running/dense/"
    # Resize input images and run single view model
    python run_midas.py --data_path "/home/xxx/Neural-Scene-Flow-Fields/kid-running/dense/" --input_w 640 --input_h 360 --resize_height 288
    # Run optical flow model (for easy setup and Pytorch version consistency, we use RAFT as backbond optical flow model, but should be easy to change to other models such as PWC-Net or FlowNet2.0)
    ./download_models.sh
    python run_flows_video.py --model models/raft-things.pth --data_path /home/xxx/Neural-Scene-Flow-Fields/kid-running/dense/ --epi_threhold 1.0 --input_flow_w 768 --input_semantic_w 1024 --input_semantic_h 576

Rendering from an example pretrained model

  1. Download pretraind model "kid-running_ndc_5f_sv_of_sm_unify3_F00-30.zip" from link. Unzipping and putting it in the folder "nsff_exp/logs/kid-running_ndc_5f_sv_of_sm_unify3_F00-30/360000.tar".

Set datadir in config/config_kid-running.txt to the root directory of input video. Then go to directory "nsff_exp":

   cd nsff_exp
  1. Rendering of fixed time, viewpoint interpolation
   python run_nerf.py --config configs/config_kid-running.txt --render_bt --target_idx 10

By running the example command, you should get the following result: Alt Text

  1. Rendering of fixed viewpoint, time interpolation
   python run_nerf.py --config configs/config_kid-running.txt --render_lockcam_slowmo --target_idx 8

By running the example command, you should get the following result: Alt Text

  1. Rendering of space-time interpolation
   python run_nerf.py --config configs/config_kid-running.txt --render_slowmo_bt  --target_idx 10

By running the example command, you should get the following result: Alt Text

Training

  1. In configs/config_kid-running.txt, modifying expname to any name you like (different from the original one), and running the following command to train the model:
    python run_nerf.py --config configs/config_kid-running.txt

The per-scene training takes ~2 days using 2 Nvidia V100 GPUs.

  1. Several parameters in config files you might need to know for training a good model
  • N_samples: in order to render images with higher resolution, you have to increase number sampled points
  • start_frame, end_frame: indicate training frame range. The default model usually works for video of 1~2s. Training on longer frames can cause oversmooth rendering. To mitigate the effect, you can increase the capacity of the network by increasing netwidth (but it can drastically increase training time and memory usage).
  • decay_iteration: number of iteartion in initialization stage. Data-driven losses will decay every 1000*decay_iteration steps. It's usually good to match decay_iteration to the number of training frames.
  • no_ndc: our current implementation only supports reconstruction in NDC space, meaning it only works for forward-facing scene like original NeRF. But it should be not hard to adapt to euclidean space.
  • use_motion_mask, num_extra_sample: whether to use estimated coarse motion segmentation mask to perform hard-mining sampling during initialization stage, and how many extra samples during initialization stage.
  • w_depth, w_optical_flow: weight of losses for single-view depth and geometry consistency priors described in the paper
  • w_cycle: weights of scene flow cycle consistency loss
  • w_sm: weight of scene flow smoothness loss
  • w_prob_reg: weight of disocculusion weight regularization

Evaluation on the Dynamic Scene Dataset

  1. Download Dynamic Scene dataset "dynamic_scene_data_full.zip" from link

  2. Download pretrained model "dynamic_scene_pretrained_models.zip" from link, unzip and put them in the folder "nsff_exp/logs/"

  3. Run the following command for each scene to get quantitative results reported in the paper:

   # Usage: configs/config_xxx.txt indicates each scene name such as config_balloon1-2.txt in nsff/configs
   python evaluation.py --config configs/config_xxx.txt
  • Note: you have to use modified LPIPS implementation included in this branch in order to measure LIPIS error for dynamic region only as described in the paper.

Acknowledgment

The code is based on implementation of several prior work:

License

This repository is released under the MIT license.

Citation

If you find our code/models useful, please consider citing our paper:

@article{li2020neural,
  title={Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes},
  author={Li, Zhengqi and Niklaus, Simon and Snavely, Noah and Wang, Oliver},
  journal={arXiv preprint arXiv:2011.13084},
  year={2020}
}
Owner
Zhengqi Li
CS Ph.D. student at Cornell University/Cornell Tech
Zhengqi Li
Pytorch implementation of the paper "Class-Balanced Loss Based on Effective Number of Samples"

Class-balanced-loss-pytorch Pytorch implementation of the paper Class-Balanced Loss Based on Effective Number of Samples presented at CVPR'19. Yin Cui

Vandit Jain 697 Dec 29, 2022
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

This is the official repository of the paper: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability A private copy of the

Fadi Boutros 33 Dec 31, 2022
Disagreement-Regularized Imitation Learning

Due to a normalization bug the expert trajectories have lower performance than the rl_baseline_zoo reported experts. Please see the following link in

Kianté Brantley 25 Apr 28, 2022
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision. ICCV 2021.

Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision Download links and PyTorch implementation of "Towers of Ba

Blakey Wu 40 Dec 14, 2022
[NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering".

SMYRF: Efficient attention using asymmetric clustering Get started: Abstract We propose a novel type of balanced clustering algorithm to approximate a

Giannis Daras 46 Dec 22, 2022
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 01, 2023
ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

Ibai Gorordo 18 Nov 06, 2022
Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at [email protected]

TableParser Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at DS3 Lab 11 Dec 13, 2022

PyTorch code for the ICCV'21 paper: "Always Be Dreaming: A New Approach for Class-Incremental Learning"

Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning PyTorch code for the ICCV 2021 paper: Always Be Dreaming: A New Approach f

49 Dec 21, 2022
phylotorch-bito is a package providing an interface to BITO for phylotorch

phylotorch-bito phylotorch-bito is a package providing an interface to BITO for phylotorch Dependencies phylotorch BITO Installation Get the source co

Mathieu Fourment 2 Sep 01, 2022
Spearmint Bayesian optimization codebase

Spearmint Spearmint is a software package to perform Bayesian optimization. The Software is designed to automatically run experiments (thus the code n

Formerly: Harvard Intelligent Probabilistic Systems Group -- Now at Princeton 1.5k Dec 29, 2022
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr

Janghoon Han 83 Dec 20, 2022
Web-interface + rest API for classification and regression (https://jeff1evesque.github.io/machine-learning.docs)

Machine Learning This project provides a web-interface, as well as a programmatic-api for various machine learning algorithms. Supported algorithms: S

Jeff Levesque 252 Dec 11, 2022
Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica

Meftun AKARSU 52 Dec 22, 2022
DA2Lite is an automated model compression toolkit for PyTorch.

DA2Lite (Deep Architecture to Lite) is a toolkit to compress and accelerate deep network models. ⭐ Star us on GitHub — it helps!! Frameworks & Librari

Sinhan Kang 7 Mar 22, 2022
Realtime_Multi-Person_Pose_Estimation

Introduction Multi Person PoseEstimation By PyTorch Results Require Pytorch Installation git submodule init && git submodule update Demo Download conv

tensorboy 1.3k Jan 05, 2023
Jupyter notebooks for the code samples of the book "Deep Learning with Python"

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

François Chollet 16.2k Dec 30, 2022
Official implementation of YOGO for Point-Cloud Processing

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module By Chenfeng Xu, Bohan Zhai, Bichen Wu, T

Chenfeng Xu 67 Dec 20, 2022
This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec

lu zhang 48 Aug 19, 2022
Bayesian algorithm execution (BAX)

Bayesian Algorithm Execution (BAX) Code for the paper: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mut

Willie Neiswanger 38 Dec 08, 2022