Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation

Overview

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation

Open in Streamlit Open In Colab

스크린샷 2021-07-04 오후 4 11 51

This project attempted to implement the paper Putting NeRF on a Diet (DietNeRF) in JAX/Flax. DietNeRF is designed for rendering quality novel views in few-shot learning scheme, a task that vanilla NeRF (Neural Radiance Field) struggles. To achieve this, the author coins Semantic Consistency Loss to supervise DietNeRF by prior knowledge from CLIP Vision Transformer. Such supervision enables DietNeRF to learn 3D scene reconstruction with CLIP's prior knowledge on 2D views.

Besides this repo, you can check our write-up and demo here:

🤩 Demo

  1. You can check out our demo in Hugging Face Space
  2. Or you can set up our Streamlit demo locally (model checkpoints will be fetched automatically upon startup)
pip install -r requirements_demo.txt
streamlit run app.py

Streamlit Demo

Implementation

Our code is written in JAX/ Flax and mainly based upon jaxnerf from Google Research. The base code is highly optimized in GPU & TPU. For semantic consistency loss, we utilize pretrained CLIP Vision Transformer from transformers library.

To learn more about DietNeRF, our experiments and implementation, you are highly recommended to check out our very detailed Notion write-up!

스크린샷 2021-07-04 오후 4 11 51

🤗 Hugging Face Model Hub Repo

You can also find our project and our model checkpoints on our Hugging Face Model Hub Repository. The models checkpoints are located in models folder.

Our JAX/Flax implementation currently supports:

Platform Single-Host GPU Multi-Device TPU
Type Single-Device Multi-Device Single-Host Multi-Host
Training Supported Supported Supported Supported
Evaluation Supported Supported Supported Supported

💻 Installation

# Clone the repo
git clone https://github.com/codestella/putting-nerf-on-a-diet
# Create a conda environment, note you can use python 3.6-3.8 as
# one of the dependencies (TensorFlow) hasn't supported python 3.9 yet.
conda create --name jaxnerf python=3.6.12; conda activate jaxnerf
# Prepare pip
conda install pip; pip install --upgrade pip
# Install requirements
pip install -r requirements.txt
# [Optional] Install GPU and TPU support for Jax
# Remember to change cuda101 to your CUDA version, e.g. cuda110 for CUDA 11.0.
!pip install --upgrade jax "jax[cuda110]" -f https://storage.googleapis.com/jax-releases/jax_releases.html
# install flax and flax-transformer
pip install flax transformers[flax]

Dataset

Download the datasets from the NeRF official Google Drive. Please download the nerf_synthetic.zip and unzip them in the place you like. Let's assume they are placed under /tmp/jaxnerf/data/.

🤟 How to Train

  1. Train in our prepared Colab notebook: Colab Pro is recommended, otherwise you may encounter out-of-memory
  2. Train locally: set use_semantic_loss=true in your yaml configuration file to enable DietNeRF.
python -m train \
  --data_dir=/PATH/TO/YOUR/SCENE/DATA \ # (e.g. nerf_synthetic/lego)
  --train_dir=/PATH/TO/THE/PLACE/YOU/WANT/TO/SAVE/CHECKPOINTS \
  --config=configs/CONFIG_YOU_LIKE

💎 Experimental Results

Rendered Rendering images by 8-shot learned DietNeRF

DietNeRF has a strong capacity to generalise on novel and challenging views with EXTREMELY SMALL TRAINING SAMPLES!

HOTDOG / DRUM / SHIP / CHAIR / LEGO / MIC

Rendered GIF by occluded 14-shot learned NeRF and Diet-NeRF

We made artificial occlusion on the right side of image (Only picked left side training poses). The reconstruction quality can be compared with this experiment. DietNeRF shows better quality than Original NeRF when It is occluded.

Training poses

LEGO

Diet NeRF NeRF

SHIP

Diet NeRF NeRF

👨‍👧‍👦 Our Team

Teams Members
Project Managing Stella Yang To Watch Our Project Progress, Please Check Our Project Notion
NeRF Team Stella Yang, Alex Lau, Seunghyun Lee, Hyunkyu Kim, Haswanth Aekula, JaeYoung Chung
CLIP Team Seunghyun Lee, Sasikanth Kotti, Khalid Sifullah , Sunghyun Kim
Cloud TPU Team Alex Lau, Aswin Pyakurel, JaeYoung Chung, Sunghyun Kim

*Special mention to our "night owl" contributors 🦉 : Seunghyun Lee, Alex Lau, Stella Yang, Haswanth Aekula

💞 Social Impact

  • Game Industry
  • Augmented Reality Industry
  • Virtual Reality Industry
  • Graphics Industry
  • Online shopping
  • Metaverse
  • Digital Twin
  • Mapping / SLAM

🌱 References

This project is based on “JAX-NeRF”.

@software{jaxnerf2020github,
  author = {Boyang Deng and Jonathan T. Barron and Pratul P. Srinivasan},
  title = {{JaxNeRF}: an efficient {JAX} implementation of {NeRF}},
  url = {https://github.com/google-research/google-research/tree/master/jaxnerf},
  version = {0.0},
  year = {2020},
}

This project is based on “Putting NeRF on a Diet”.

@misc{jain2021putting,
      title={Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis}, 
      author={Ajay Jain and Matthew Tancik and Pieter Abbeel},
      year={2021},
      eprint={2104.00677},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🔑 License

Apache License 2.0

❤️ Special Thanks

Our Project is motivated by HuggingFace X GoogleAI (JAX) Community Week Event 2021.

We would like to take this chance to thank Hugging Face for organizing such an amazing open-source initiative, Suraj and Patrick for all the technical help. We learn a lot throughout this wonderful experience!

스크린샷 2021-07-04 오후 4 11 51

Finally, we would like to thank Common Computer AI for sponsoring our team access to V100 multi-GPUs server. Thank you so much for your support!

스크린샷

Owner
Stella Seoyeon Yang's New Github Account for Research. Ph.D. Candidate Student in SNU, CV lab.
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.

Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers

Amir Shakouri 1 Mar 11, 2022
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
Airborne Optical Sectioning (AOS) is a wide synthetic-aperture imaging technique

AOS: Airborne Optical Sectioning Airborne Optical Sectioning (AOS) is a wide synthetic-aperture imaging technique that employs manned or unmanned airc

JKU Linz, Institute of Computer Graphics 39 Dec 09, 2022
The official repo of the CVPR2021 oral paper: Representative Batch Normalization with Feature Calibration

Representative Batch Normalization (RBN) with Feature Calibration The official implementation of the CVPR2021 oral paper: Representative Batch Normali

Open source projects of ShangHua-Gao 76 Nov 09, 2022
The implementation for the SportsCap (IJCV 2021)

SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos ProjectPage | Paper | Video | Dataset (Part01

Chen Xin 79 Dec 16, 2022
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Dinghan Shen 49 Dec 22, 2022
Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)

GraspNet Baseline Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020). [paper] [dataset] [API] [do

GraspNet 209 Dec 29, 2022
Hidden-Fold Networks (HFN): Random Recurrent Residuals Using Sparse Supermasks

Hidden-Fold Networks (HFN): Random Recurrent Residuals Using Sparse Supermasks by Ángel López García-Arias, Masanori Hashimoto, Masato Motomura, and J

Ángel López García-Arias 4 May 19, 2022
🧠 A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation.', ECCV 2016

Deep CORAL A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation. B Sun, K Saenko, ECCV 2016' Deep CORAL can learn

Andy Hsu 200 Dec 25, 2022
Serverless proxy for Spark cluster

Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f

hydrosphere.io 317 Dec 01, 2022
Some pvbatch (paraview) scripts for postprocessing OpenFOAM data

pvbatchForFoam Some pvbatch (paraview) scripts for postprocessing OpenFOAM data For every script there is a help message available: pvbatch pv_state_s

Morev Ilya 2 Oct 26, 2022
Transfer Learning Remote Sensing

Transfer_Learning_Remote_Sensing Simulation R codes for data generation and visualizations are in the folder simulation. Experiment: California Housin

2 Jun 21, 2022
The first dataset on shadow generation for the foreground object in real-world scenes.

Object-Shadow-Generation-Dataset-DESOBA Object Shadow Generation is to deal with the shadow inconsistency between the foreground object and the backgr

BCMI 105 Dec 30, 2022
Face Mask Detection on Image and Video using tensorflow and keras

Face-Mask-Detection Face Mask Detection on Image and Video using tensorflow and keras Train Neural Network on face-mask dataset using tensorflow and k

Nahid Ebrahimian 12 Nov 11, 2022
PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Introduction This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection. Up

133 Dec 29, 2022
Style-based Neural Drum Synthesis with GAN inversion

Style-based Drum Synthesis with GAN Inversion Demo TensorFlow implementation of a style-based version of the adversarial drum synth (ADS) from the pap

Sound and Music Analysis (SoMA) Group 29 Nov 19, 2022
The implementation of the lifelong infinite mixture model

Lifelong infinite mixture model 📋 This is the implementation of the Lifelong infinite mixture model 📋 Accepted by ICCV 2021 Title : Lifelong Infinit

Fei Ye 5 Oct 20, 2022
A framework to train language models to learn invariant representations.

Invariant Language Modeling Implementation of the training for invariant language models. Motivation Modern pretrained language models are critical co

6 Nov 16, 2022
Repository of 3D Object Detection with Pointformer (CVPR2021)

3D Object Detection with Pointformer This repository contains the code for the paper 3D Object Detection with Pointformer (CVPR 2021) [arXiv]. This wo

Zhuofan Xia 117 Jan 06, 2023
PyTorch Implementation of Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation.

DosGAN-PyTorch PyTorch Implementation of Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation

40 Nov 30, 2022