BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey
IEEE International Conference on Computer Vision (ICCV), 2021 (oral presentation)

Project page: https://chenhsuanlin.bitbucket.io/bundle-adjusting-NeRF
arXiv preprint: https://arxiv.org/abs/2104.06405

We provide PyTorch code for the NeRF experiments on both synthetic (Blender) and real-world (LLFF) datasets.


Prerequisites

This code was developed with Python 3 (python3). PyTorch 1.9+ is required.
We recommend using Anaconda to set up the environment. Install the dependencies and activate the environment barf-env with

conda env create --file requirements.yaml python=3
conda activate barf-env

Initialize the external submodule dependencies with

git submodule update --init --recursive

Dataset

  • Synthetic data (Blender) and real-world data (LLFF)

    Both the Blender synthetic data and the LLFF real-world data can be found in the NeRF Google Drive. For convenience, you can download them by running the following commands from the root of this repo:
    # Blender
    gdown --id 18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG # download nerf_synthetic.zip
    unzip nerf_synthetic.zip
    rm -f nerf_synthetic.zip
    mv nerf_synthetic data/blender
    # LLFF
    gdown --id 16VnMcF1KJYxN9QId6TClMsZRahHNMW5g # download nerf_llff_data.zip
    unzip nerf_llff_data.zip
    rm -f nerf_llff_data.zip
    mv nerf_llff_data data/llff
    The data directory should contain the subdirectories blender and llff. If you already have the datasets downloaded, you can alternatively soft-link them within the data directory.
  • iPhone (TODO)
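
As a quick sanity check of the data layout described above, a minimal sketch along these lines could confirm that data/blender and data/llff exist, and soft-link an existing download into place. The helper names (link_dataset, missing_datasets) are hypothetical and not part of this repository.

```python
import os

EXPECTED = ("blender", "llff")  # subdirectories named in this README


def link_dataset(src, name, data_dir="data"):
    """Soft-link an existing dataset directory `src` to data_dir/name."""
    os.makedirs(data_dir, exist_ok=True)
    dst = os.path.join(data_dir, name)
    # Leave any existing directory or link untouched.
    if not os.path.lexists(dst):
        os.symlink(os.path.abspath(src), dst, target_is_directory=True)
    return dst


def missing_datasets(data_dir="data"):
    """Return the expected subdirectories that are absent from data_dir."""
    return [n for n in EXPECTED if not os.path.isdir(os.path.join(data_dir, n))]
```

Calling missing_datasets() from the repository root returns an empty list once both datasets are in place.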


Running the code

  • BARF models

    To train and evaluate BARF:

    # <GROUP> and <NAME> can be set to anything you like, while <SCENE> is specific to each dataset
    
    # Blender (<SCENE>={chair,drums,ficus,hotdog,lego,materials,mic,ship})
    python3 train.py --group=<GROUP> --model=barf --yaml=barf_blender --name=<NAME> --data.scene=<SCENE> --barf_c2f=[0.1,0.5]
    python3 evaluate.py --group=<GROUP> --model=barf --yaml=barf_blender --name=<NAME> --data.scene=<SCENE> --data.val_sub= --resume
    
    # LLFF (<SCENE>={fern,flower,fortress,horns,leaves,orchids,room,trex})
    python3 train.py --group=<GROUP> --model=barf --yaml=barf_llff --name=<NAME> --data.scene=<SCENE> --barf_c2f=[0.1,0.5]
    python3 evaluate.py --group=<GROUP> --model=barf --yaml=barf_llff --name=<NAME> --data.scene=<SCENE> --resume

    All the results will be stored in the directory output/<GROUP>/<NAME>. You may want to organize your experiments by grouping different runs in the same group.

    To train baseline models:

    • Full positional encoding: omit the --barf_c2f argument.
    • No positional encoding: add --arch.posenc!.

    If you want to evaluate a checkpoint at a specific iteration number, use --resume=<ITER_NUMBER> instead of just --resume.

  • Training the original NeRF

    If you want to train the reference NeRF models (assuming known camera poses):

    # Blender
    python3 train.py --group=<GROUP> --model=nerf --yaml=nerf_blender --name=<NAME> --data.scene=<SCENE>
    python3 evaluate.py --group=<GROUP> --model=nerf --yaml=nerf_blender --name=<NAME> --data.scene=<SCENE> --data.val_sub= --resume
    
    # LLFF
    python3 train.py --group=<GROUP> --model=nerf --yaml=nerf_llff --name=<NAME> --data.scene=<SCENE>
    python3 evaluate.py --group=<GROUP> --model=nerf --yaml=nerf_llff --name=<NAME> --data.scene=<SCENE> --resume

    If you wish to replicate the results from the original NeRF paper, use --yaml=nerf_blender_repr (for Blender) or --yaml=nerf_llff_repr (for LLFF) instead. These configs differ from the defaults in some respects, e.g. NDC (normalized device coordinates) will be used for the LLFF forward-facing dataset. (The reference NeRF models considered in the paper do not use NDC to parametrize the 3D points.)

  • Visualizing the results

    We have included code to visualize the training over TensorBoard and Visdom. The TensorBoard events include the following:

    • SCALARS: the rendering losses and PSNR over the course of optimization. For BARF, the rotational/translational errors with respect to the given poses are also computed.
    • IMAGES: visualization of the RGB images and the RGB/depth rendering.

    We also provide visualization of 3D camera poses in Visdom. Run visdom -port 9000 to start the Visdom server.
    The Visdom host server defaults to localhost; this can be overridden with --visdom.server (see options/base.yaml for details). If you want to disable Visdom visualization, add --visdom!.
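
The --barf_c2f=[0.1,0.5] argument in the BARF commands above controls the coarse-to-fine schedule for the positional encoding. The following is a minimal sketch of the per-frequency weighting described in the BARF paper, not a copy of this repository's code. It assumes the two values are the start and end fractions of training progress over which the frequency bands are gradually enabled; the function name c2f_weights and its parameters are hypothetical.

```python
import math


def c2f_weights(progress, num_freqs, c2f=(0.1, 0.5)):
    """Per-frequency weights for coarse-to-fine positional encoding.

    progress:  fraction of training completed, in [0, 1].
    num_freqs: number of frequency bands L in the positional encoding.
    c2f:       (start, end) progress fractions; ASSUMED meaning of --barf_c2f.
    """
    start, end = c2f
    # alpha grows linearly from 0 to L while progress moves from start to end.
    alpha = (progress - start) / (end - start) * num_freqs
    weights = []
    for k in range(num_freqs):
        x = min(max(alpha - k, 0.0), 1.0)  # clamp (alpha - k) to [0, 1]
        # Smooth cosine ramp from 0 (band disabled) to 1 (band fully enabled).
        weights.append((1.0 - math.cos(x * math.pi)) / 2.0)
    return weights
```

Under this reading, every band has weight 0 before 10% of training and weight 1 after 50%, with low frequencies enabled before high ones in between. The full positional encoding baseline (no --barf_c2f) corresponds to all weights fixed at 1.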


Codebase structure

The main engine and network architecture in model/barf.py inherit from those in model/nerf.py. The codebase is structured this way so that it is easy to see exactly which parts BARF extends from NeRF. It is also simple to build your own exciting applications upon either BARF or NeRF -- just inherit them again! The same applies to the dataset files (e.g. data/blender.py).
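
The inheritance pattern described above can be illustrated with a small, self-contained sketch. The class names and the describe() method below are hypothetical stand-ins chosen for illustration; they are not the classes defined in model/nerf.py or model/barf.py.

```python
class NeRFEngine:
    """Hypothetical stand-in for the engine in model/nerf.py."""

    def describe(self):
        # Components the base engine is responsible for.
        return ["radiance field", "volume rendering"]


class BARFEngine(NeRFEngine):
    """Hypothetical stand-in for model/barf.py: reuse NeRF, add one part."""

    def describe(self):
        # Only the extension is written here; the rest comes from the parent.
        return super().describe() + ["camera pose refinement"]


class MyEngine(BARFEngine):
    """Your own application: inherit again and override only what changes."""

    def describe(self):
        return super().describe() + ["my extension"]
```

Each subclass contains only what it adds, which is what makes the difference between the levels easy to read.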

To understand the config and command-line options, take the following command as an example:

python3 train.py --group=<GROUP> --model=barf --yaml=barf_blender --name=<NAME> --data.scene=<SCENE> --barf_c2f=[0.1,0.5]

This will run model/barf.py as the main engine with options/barf_blender.yaml as the main config file. Note that barf hierarchically inherits nerf (which inherits base), making the codebase customizable.
The complete configuration will be printed upon execution. To override specific options, add --<key>=value or --<key1>.<key2>=value (and so on) to the command line. The configuration will be loaded as the variable opt throughout the codebase.
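
To make the override syntax concrete, here is a minimal sketch of how dotted command-line keys could be applied to a nested configuration. It is an illustration only, not the repository's parser: apply_overrides and parse_value are hypothetical names, values are interpreted with ast.literal_eval as an assumption, and mapping an empty value (as in --data.val_sub=) to None is likewise assumed.

```python
import ast


def parse_value(text):
    """Interpret a command-line value as a Python literal when possible."""
    if text == "":
        return None  # ASSUMPTION: an empty value clears the option
    try:
        return ast.literal_eval(text)  # numbers, lists such as [0.1,0.5]
    except (ValueError, SyntaxError):
        return text  # plain strings such as barf_blender


def apply_overrides(opt, args):
    """Apply --a.b=value style arguments to the nested dict `opt` in place."""
    for arg in args:
        body = arg.lstrip("-")
        if "=" in body:
            key, raw = body.split("=", 1)
            value = parse_value(raw)
        elif body.endswith("!"):
            key, value = body[:-1], False  # --visdom! style: switch off
        else:
            key, value = body, True  # bare flag such as --resume
        # Walk down the dotted key, creating intermediate levels as needed.
        *parents, leaf = key.split(".")
        node = opt
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return opt
```

For example, applying ["--data.scene=chair", "--visdom!"] to an existing configuration changes opt["data"]["scene"] and replaces opt["visdom"] with False, leaving all other keys as loaded from the YAML files.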

Some tips on using and understanding the codebase:

  • The computation graph for forward/backprop is stored in var throughout the codebase.
  • The losses are stored in loss. To add a new loss function, just implement it in compute_loss() and add its weight to opt.loss_weight.<name>. It will automatically be added to the overall loss and logged to TensorBoard.
  • If you are using a multi-GPU machine, you can add --gpu=<gpu_number> to specify which GPU to use. Multi-GPU training/evaluation is currently not supported.
  • To resume from a previous checkpoint, add --resume=<ITER_NUMBER>, or just --resume to resume from the latest checkpoint.
  • (to be continued....)
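
The tip on losses above can be pictured with the following minimal sketch of combining named loss terms with per-name weights. The function total_loss and its scale parameter are hypothetical; in particular, how the codebase turns an opt.loss_weight value into a multiplier is not stated in this README, so the linear default here is an assumption. Check the loss summation in the model code before relying on it.

```python
def total_loss(loss, loss_weight, scale=lambda w: w):
    """Combine named loss terms into one scalar.

    loss:        dict mapping a loss name to its value (filled by compute_loss()).
    loss_weight: dict mapping a loss name to its weight (opt.loss_weight.<name>).
    scale:       maps a stored weight to a multiplier. Linear by default
                 (ASSUMPTION); a codebase that stores weights as log10
                 exponents would use `lambda w: 10 ** w` instead.
    Terms whose weight is missing or None are skipped.
    """
    total = 0.0
    for name, value in loss.items():
        weight = loss_weight.get(name)
        if weight is not None:
            total += scale(weight) * value
    return total
```

With this structure, adding a new term only requires a new entry in each dictionary; the summation itself does not change.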

If you find our code useful for your research, please cite

@inproceedings{lin2021barf,
  title={BARF: Bundle-Adjusting Neural Radiance Fields},
  author={Lin, Chen-Hsuan and Ma, Wei-Chiu and Torralba, Antonio and Lucey, Simon},
  booktitle={IEEE International Conference on Computer Vision ({ICCV})},
  year={2021}
}

Please contact me ([email protected]) if you have any questions!
