Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

Last update: May 18, 2022

Related tags

Deep Learning BlockGAN

Overview

BlockGAN

Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
Thu Nguyen-Phuoc, Chrisian Richardt, Long Mai, Yong-liang Yang, Niloy Mitra

Dataset

Please contact Thu Nguyen-Phuoc for datasets.

Training

To run the training of BlockGAN

python main.py ./config_synthetic.json --dataset Chair --input_fname_pattern ".png" 

python main.py ./config_real.json --dataset Car --input_fname_pattern ".jpg"

Help with config.json

image_path:
			Full path to the dataset directory.
gpu:
			Index number of the GPU to use. Default: 0.
batch_size:
			Batch size. Defaults is 32.
max_epochs:
			Number of epochs to train. Defaults is 50.
epoch_step:
			Number of epochs to train before starting to decrease the learning rate. Default is 25.
z_dim:
			Dimension of the noise vector. Defaults is 90.
z_dim2:
			Dimension of the noise vector. Defaults is 30.			
d_eta:
			Learning rate of the discriminator.Default is 0.0001
g_eta:
			Learning rate of the generator.Default is 0.0001
reduce_eta:
			Reduce learning rate during training.Default is False
D_update:
			Number of updates for the Discriminator for every training step.Default is 1.
G_update:
			Number of updates for the Generator for every training step.Default is 2.
beta1:
			Beta 1 for the Adam optimiser. Default is 0.5
beta2:
			Beta 2 for the Adam optimiser. Default is 0.999
discriminator:
			Name of the discriminator to use. 
generator:
			Name of the generator to use. 
view_func:
			Name of the view sampling function to use.
skew_func:
			Name of the perspective skew function to use.
train_func:
			Name of the train function to use.
build_func:
			Name of the build function to use.
style_disc:
			Use Style discriminator. Useful for training images at 128.
sample_z:
			Distribution to sample the noise fector. Default is "uniform".
add_D_noise:
			Add noise to the input of the discriminator. Default is "false".
DStyle_lambda:
			Lambda for the style discriminator loss. Default is 1.0
ele_low:
    		        Default is 70.
ele_high:
			Default is 110.
azi_low:
			Default is 0.
azi_high:
			Default is 360.
scale_low:
			Default is 1.0
scale_high:
			Default is 1.0
x_low:
			Default is 0.
x_high:
			Default is 0.
y_low:
			Default is 0.
y_high:
			Default is 0.
z_low:
			Default is 0.
z_high:
			Default is 0.
with_translation:
			To use translation in 3D transformation. Default is "true".
with_scale:
			To use scaling in 3D transformation. Default is "true".
focal_length:
			Camera parameter. Default is 35.
sensor_size:
			Camera parameter. Default is 32.
camera_dist:
			Camera distance. Default is 11.
new_size:
			Voxel grid size. Default is 16.	
size:
			Voxel grid size. Default is 16.	
output_dir: 
			Full path to the output directory.

Citation

If you use this code for your research, please cite our paper

@inproceedings{BlockGAN2020,
  title={ BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images  },
  author={Nguyen-Phuoc, Thu and Richardt, Christian and Mai, Long and Yang, Yong-Liang and Mitra, Niloy},
  booktitle =  {Advances in Neural Information Processing Systems 33},
 month = {Nov},
 year = {2020}
}

Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

Related tags

Overview

BlockGAN

Dataset

Training

Citation

Owner

DeepOBS: A Deep Learning Optimizer Benchmark Suite

SberSwap Video Swap base on deep learning

Solver for Large-Scale Rank-One Semidefinite Relaxations

Harmonious Textual Layout Generation over Natural Images via Deep Aesthetics Learning

Categorical Depth Distribution Network for Monocular 3D Object Detection

This is a repository with the code for the ACL 2019 paper

Code and data for "TURL: Table Understanding through Representation Learning"

This repo provides the base code for pytorch-lightning and weight and biases simultaneous integration.

MEDS: Enhancing Memory Error Detection for Large-Scale Applications

Code release for Local Light Field Fusion at SIGGRAPH 2019

BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins

Code for BMVC2021 paper "Boundary Guided Context Aggregation for Semantic Segmentation"

Just-Now - This Is Just Now Login Friendlist Cloner Tools

A pytorch-based real-time segmentation model for autonomous driving

Train emoji embeddings based on emoji descriptions.

Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

A video scene detection algorithm is designed to detect a variety of different scenes within a video

Pyramid Pooling Transformer for Scene Understanding

Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".