Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

Overview

Inverse Rendering for Complex Indoor Scenes:
Shape, Spatially-Varying Lighting and SVBRDF
From a Single Image
(Project page)

Zhengqin Li, Mohammad Shafiei, Ravi Ramamoorthi, Kalyan Sunkavalli, Manmohan Chandraker

Useful links:

Results on our new dataset

This is the official code release of paper Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image. The original models were trained by extending the SUNCG dataset with an SVBRDF-mapping. Since SUNCG is not available now due to copyright issues, we are not able to release the original models. Instead, we rebuilt a new high-quality synthetic indoor scene dataset and trained our models on it. We will release the new dataset in the near future. The geometry configurations of the new dataset are based on ScanNet [1], which is a large-scale repository of 3D scans of real indoor scenes. Some example images can be found below. A video is at this link Insverse rendering results of the models trained on the new datasets are shown below. Scene editing applications results on real images are shown below, including results on object insertion and material editing. Models trained on the new dataset achieve comparable performances compared with our previous models. Quantitaive comparisons are listed below, where [Li20] represents our previous models trained on the extended SUNCG dataset.

Download the trained models

The trained models can be downloaded from the link. To test the models, please copy the models to the same directory as the code and run the commands as shown below.

Train and test on the synthetic dataset

To train the full models on the synthetic dataset, please run the commands

  • python trainBRDF.py --cuda --cascadeLevel 0 --dataRoot DATA: Train the first cascade of MGNet.
  • python trainLight.py --cuda --cascadeLevel 0 --dataRoot DATA: Train the first cascade of LightNet.
  • python trainBRDFBilateral.py --cuda --cascadeLevel 0 --dataRoot DATA: Train the bilateral solvers.
  • python outputBRDFLight.py --cuda --dataRoot DATA: Output the intermediate predictions, which will be used to train the second cascade.
  • python trainBRDF.py --cuda --cascadeLevel 1 --dataRoot DATA: Train the first cascade of MGNet.
  • python trainLight.py --cuda --cascadeLevel 1 --dataRoot DATA: Train the first cascade of LightNet.
  • python trainBRDFBilateral.py --cuda --cascadeLevel 1 --dataRoot DATA: Train the bilateral solvers.

To test the full models on the synthetic dataset, please run the commands

  • python testBRDFBilateral.py --cuda --dataRoot DATA: Test the BRDF and geometry predictions.
  • python testLight.py --cuda --cascadeLevel 0 --dataRoot DATA: Test the light predictions of the first cascade.
  • python testLight.py --cuda --cascadeLevel 1 --dataRoot DATA: Test the light predictions of the first cascade.

Train and test on IIW dataset for intrinsic decomposition

To train on the IIW dataset, please first train on the synthetic dataset and then run the commands:

  • python trainFineTuneIIW.py --cuda --dataRoot DATA --IIWRoot IIW: Fine-tune the network on the IIW dataset.

To test the network on the IIW dataset, please run the commands

  • bash runIIW.sh: Output the predictions for the IIW dataset.
  • python CompareWHDR.py: Compute the WHDR on the predictions.

Please fixing the data route in runIIW.sh and CompareWHDR.py.

Train and test on NYU dataset for geometry prediction

To train on the BYU dataset, please first train on the synthetic dataset and then run the commands:

  • python trainFineTuneNYU.py --cuda --dataRoot DATA --NYURoot NYU: Fine-tune the network on the NYU dataset.
  • python trainFineTuneNYU_casacde1.py --cuda --dataRoot DATA --NYURoot NYU: Fine-tune the network on the NYU dataset.

To test the network on the NYU dataset, please run the commands

  • bash runNYU.sh: Output the predictions for the NYU dataset.
  • python CompareNormal.py: Compute the normal error on the predictions.
  • python CompareDepth.py: Compute the depth error on the predictions.

Please remember fixing the data route in runNYU.sh, CompareNormal.py and CompareDepth.py.

Train and test on Garon19 [2] dataset for object insertion

There is no fine-tuning for the Garon19 dataset. To test the network, download the images from this link. And then run bash runReal20.sh. Please remember fixing the data route in runReal20.sh.

All object insertion results and comparisons with prior works can be found from this link. The code to run object insertion can be found from this link.

Differences from the original paper

The current implementation has 3 major differences from the original CVPR20 implementation.

  • In the new models, we do not use spherical Gaussian parameters generated from optimization for supervision. That is mainly because the optimization proceess is time consuming and we have not finished that process yet. We will update the code once it is done. The performance with spherical Gaussian supervision is expected to be better.
  • The resolution of the second cascade is changed from 480x640 to 240x320. We find that the networks can generate smoother results with smaller resolution.
  • We remove the light source segmentation mask as an input. It does not have a major impact on the final results.

Reference

[1] Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Nießner, M. (2017). Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5828-5839).

[2] Garon, M., Sunkavalli, K., Hadap, S., Carr, N., & Lalonde, J. F. (2019). Fast spatially-varying indoor lighting estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6908-6917).

This is Unofficial Repo. Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection (CVPR 2021)

Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection This is a PyTorch implementation of the LipForensics paper. This is an U

Minha Kim 2 May 11, 2022
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Michael Schlichtkrull 29 Sep 02, 2022
Sdf sparse conv - Deep Learning on SDF for Classifying Brain Biomarkers

Deep Learning on SDF for Classifying Brain Biomarkers To reproduce the results f

1 Jan 25, 2022
A Structured Self-attentive Sentence Embedding

Structured Self-attentive sentence embeddings Implementation for the paper A Structured Self-Attentive Sentence Embedding, which was published in ICLR

Kaushal Shetty 488 Nov 28, 2022
[ECCV2020] Content-Consistent Matching for Domain Adaptive Semantic Segmentation

[ECCV20] Content-Consistent Matching for Domain Adaptive Semantic Segmentation This is a PyTorch implementation of CCM. News: GTA-4K list is available

Guangrui Li 88 Aug 25, 2022
[KDD 2021, Research Track] DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural Networks

DiffMG This repository contains the code for our KDD 2021 Research Track paper: DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neura

AutoML Research 24 Nov 29, 2022
This is the source code for the experiments related to the paper Unsupervised Audio Source Separation Using Differentiable Parametric Source Models

Unsupervised Audio Source Separation Using Differentiable Parametric Source Models This is the source code for the experiments related to the paper Un

30 Oct 19, 2022
Implementation for ACProp ( Momentum centering and asynchronous update for adaptive gradient methdos, NeurIPS 2021)

This repository contains code to reproduce results for submission NeurIPS 2021, "Momentum Centering and Asynchronous Update for Adaptive Gradient Meth

Juntang Zhuang 15 Jun 11, 2022
General Vision Benchmark, a project from OpenGVLab

Introduction We build GV-B(General Vision Benchmark) on Classification, Detection, Segmentation and Depth Estimation including 26 datasets for model e

174 Dec 27, 2022
Steerable discovery of neural audio effects

Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often

Christian J. Steinmetz 182 Dec 29, 2022
Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds."

DeltaConv [Paper] [Project page] Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds" by Ru

98 Nov 26, 2022
Extracting and filtering paraphrases by bridging natural language inference and paraphrasing

nli2paraphrases Source code repository accompanying the preprint Extracting and filtering paraphrases by bridging natural language inference and parap

Matej Klemen 1 Mar 09, 2022
Nest - A flexible tool for building and sharing deep learning modules

Nest - A flexible tool for building and sharing deep learning modules Nest is a flexible deep learning module manager, which aims at encouraging code

ZhouYanzhao 41 Oct 10, 2022
ANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts

ANEA The goal of Automatic (Named) Entity Annotation is to create a small annotated dataset for NER extracted from German domain-specific texts. Insta

Anastasia Zhukova 2 Oct 07, 2022
Official repository of the paper "GPR1200: A Benchmark for General-PurposeContent-Based Image Retrieval"

GPR1200 Dataset GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval (ArXiv) Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus J

Visual Computing Group 16 Nov 21, 2022
The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Swin-Unet The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"(https://arxiv.org/abs/2105.05537). A validatio

869 Jan 07, 2023
A PyTorch implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

Caiyong Wang 14 Sep 20, 2022
HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)

Code for HDR Video Reconstruction HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021) Guanying Chen, Cha

Guanying Chen 64 Nov 19, 2022
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Rubicon Purpose Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a r

Capital One 97 Jan 03, 2023
Modified fork of Xuebin Qin's U-2-Net Repository. Used for demonstration purposes.

U^2-Net (U square net) Modified version of U2Net used for demonstation purposes. Paper: U^2-Net: Going Deeper with Nested U-Structure for Salient Obje

Shreyas Bhat Kera 13 Aug 28, 2022