Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021


ATISS: Autoregressive Transformers for Indoor Scene Synthesis


This repository contains the code that accompanies our paper ATISS: Autoregressive Transformers for Indoor Scene Synthesis.

You can find detailed usage instructions for training your own models, using our pretrained models as well as performing the interactive tasks described in the paper below.

If you found this work influential or helpful for your research, please consider citing:

  author = {Despoina Paschalidou and Amlan Kar and Maria Shugrina and Karsten Kreis and Andreas Geiger and Sanja Fidler},
  title = {ATISS: Autoregressive Transformers for Indoor Scene Synthesis},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2021}

Installation & Dependencies

Our codebase has the following dependencies:

For the visualizations, we use simple-3dviz, our easy-to-use library for visualizing 3D data using Python and ModernGL, and matplotlib for the colormaps. Note that simple-3dviz provides a lightweight scene viewer using wxpython. If you wish to use our scripts for visualizing the generated scenes, you will also need to install wxpython. Note that for all the renderings in the paper we used NVIDIA's OMNIVERSE.

The simplest way to make sure that you have all dependencies in place is to use conda. You can create a conda environment called atiss using

conda env create -f environment.yaml
conda activate atiss

Next, compile the extension modules. You can do this via

python setup.py build_ext --inplace
pip install -e .


To evaluate a pretrained model or train a new model from scratch, you need to obtain the 3D-FRONT and the 3D-FUTURE datasets. To download both datasets, please refer to the instructions provided on the datasets' webpage. Once you have downloaded both datasets, you are ready to start the preprocessing. In addition to a preprocessing script, we also provide a script for visualising 3D-FRONT scenes, which you can execute by running

python SCENE_ID path_to_output_dir path_to_3d_front_dataset_dir path_to_3d_future_dataset_dir path_to_3d_future_model_info path_to_floor_plan_texture_images

You can also visualize the walls, the windows and objects with textures by setting the corresponding arguments. In addition to visualizing the scene with id SCENE_ID, the script generates a subfolder in the output folder (specified via the path_to_output_dir argument) that contains the .obj files and the textures of all objects in this scene.

Data Preprocessing

Once you have downloaded the 3D-FRONT and 3D-FUTURE datasets, you need to run the preprocessing script to prepare the data, so that you can train your own models or generate new scenes using previously trained models. To run the preprocessing script, simply run

python path_to_output_dir path_to_3d_front_dataset_dir path_to_3d_future_dataset_dir path_to_3d_future_model_info path_to_floor_plan_texture_images --dataset_filtering threed_front_bedroom

Note that you can choose the filtering for the different room types (e.g. bedrooms, living rooms, dining rooms, libraries) via the dataset_filtering argument. The path_to_floor_plan_texture_images is the path to a folder containing the floor plan textures that are necessary to render the rooms using a top-down orthographic projection. An example of such a folder can be found in the demo/floor_plan_texture_images folder.

This script starts by parsing all scenes from the 3D-FRONT dataset. Then, for each scene, it generates a subfolder inside the path_to_output_dir that contains the information for all objects in the scene (boxes.npz), the room mask (room_mask.png) and the scene rendered using a top-down orthographic projection (rendered_scene_256.png). Note that for living rooms and dining rooms you also need to change the size of the room during rendering from 3.1m, which is the default value, to 6.2m via the --room_side argument.
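As a quick sanity check after preprocessing, a sketch along the following lines could list the scene folders that are missing any of the three files named above. The helper name find_incomplete_scenes is hypothetical and not part of the repository; only the three file names come from the description above.

```python
from pathlib import Path

# File names taken from the description of the preprocessing output.
EXPECTED_FILES = ("boxes.npz", "room_mask.png", "rendered_scene_256.png")


def find_incomplete_scenes(output_dir):
    """Return {scene_folder_name: [missing files]} for incomplete scenes.

    Hypothetical helper, not part of the ATISS codebase.
    """
    incomplete = {}
    for scene_dir in sorted(Path(output_dir).iterdir()):
        if not scene_dir.is_dir():
            continue  # ignore stray files in the output directory
        missing = [f for f in EXPECTED_FILES if not (scene_dir / f).exists()]
        if missing:
            incomplete[scene_dir.name] = missing
    return incomplete
```

Calling find_incomplete_scenes("path_to_output_dir") returns an empty dictionary when every scene folder contains all three files.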

Moreover, you will notice that the script takes a significant amount of time to parse all 3D-FRONT scenes. To reduce the waiting time, we cache the parsed scenes and save them to the /tmp/threed_front.pkl file. Therefore, after you have parsed the 3D-FRONT scenes once, you can provide this path in the environment variable PATH_TO_SCENES the next time you run this script, as follows:

PATH_TO_SCENES="/tmp/threed_front.pkl" python path_to_output_dir path_to_3d_front_dataset_dir path_to_3d_future_dataset_dir path_to_3d_future_model_info path_to_floor_plan_texture_images --dataset_filtering threed_front_bedroom
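The caching behaviour described above can be pictured with the following minimal sketch. It is not the repository's implementation: load_or_parse_scenes and parse_fn are hypothetical names, and parse_fn stands in for the slow 3D-FRONT parser. Only the PATH_TO_SCENES variable and the /tmp/threed_front.pkl path come from the text.

```python
import os
import pickle


def load_or_parse_scenes(parse_fn, default_cache="/tmp/threed_front.pkl"):
    """Load parsed scenes from the pickle named by PATH_TO_SCENES, else parse.

    `parse_fn` is a hypothetical stand-in for the slow 3D-FRONT parser.
    """
    cache_path = os.environ.get("PATH_TO_SCENES")
    if cache_path and os.path.exists(cache_path):
        # Fast path: reuse the scenes parsed by an earlier run.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Slow path: parse everything, then save the result for the next run.
    scenes = parse_fn()
    with open(default_cache, "wb") as f:
        pickle.dump(scenes, f)
    return scenes
```

Because pickle can execute arbitrary code on load, a cache file like this should only be loaded if you created it yourself.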

Finally, to further reduce the preprocessing time, note that it is possible to run this script in multiple threads, as it automatically checks whether a scene has already been preprocessed and, if so, moves on to the next scene.
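The skip-if-already-done behaviour can be illustrated with the sketch below. It makes two assumptions that the text does not confirm: that the presence of boxes.npz marks a scene as finished, and that each scene lives in a folder named after its id. The names preprocess_pending and process_fn are hypothetical.

```python
import os


def preprocess_pending(scene_ids, output_dir, process_fn):
    """Process only the scenes that have no boxes.npz yet.

    Assumption: an existing boxes.npz means the scene is already preprocessed.
    `process_fn(scene_id, scene_dir)` is a hypothetical per-scene worker.
    Returns the list of scene ids processed by this call.
    """
    processed = []
    for scene_id in scene_ids:
        scene_dir = os.path.join(output_dir, scene_id)
        if os.path.exists(os.path.join(scene_dir, "boxes.npz")):
            continue  # already done, possibly by another run
        os.makedirs(scene_dir, exist_ok=True)
        process_fn(scene_id, scene_dir)
        processed.append(scene_id)
    return processed
```

A check like this is not atomic, so two concurrent runs may occasionally process the same scene; in this sketch that only costs duplicated work.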


Once you have installed all dependencies and generated the preprocessed data, you can train new models from scratch, evaluate our pre-trained models and visualize the scenes generated by one of our pre-trained models. All scripts expect a path to a config file. In the config folder you can find the configuration files for the different room types. Make sure to change the dataset_directory argument to the path where you saved the preprocessed data.

Scene Generation

To generate rooms using a previously trained model, we provide a script that you can execute by running

python path_to_config_yaml path_to_output_dir path_to_3d_future_pickled_data path_to_floor_plan_texture_images --weight_file path_to_weight_file

where the argument --weight_file specifies the path to a trained model and the argument path_to_config_yaml defines the path to the config file used to train that particular model. By default, this script randomly selects floor plans from the test set and, conditioned on each floor plan, generates different arrangements of objects. If you want to generate a scene conditioned on a specific floor plan, you can select it by providing its scene id via the --scene_id argument. If you want to run this script headlessly, set the --without_screen argument. Finally, path_to_3d_future_pickled_data specifies the path to the pickled ThreedFutureDataset.

Scene Completion & Object Placement

To perform scene completion, we provide a script that can be executed by running

python path_to_config_yaml path_to_output_dir path_to_3d_future_pickled_data path_to_floor_plan_texture_images --weight_file path_to_weight_file

where the argument --weight_file specifies the path to a trained model and the argument path_to_config_yaml defines the path to the config file used to train that particular model. For this script, make sure that the encoding type in the config file also contains the word eval. By default, this script randomly selects a room from the test set and, conditioned on this partial scene, populates the empty space with objects. However, you can choose a specific room via the --scene_id argument. This script can also be used to perform object placement, namely, starting from a partial scene, adding an object of a specific object category.
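If you want to guard against running an interactive script with the wrong config, a check along these lines may help. It assumes the loaded config is a dictionary with the encoding type stored under config["data"]["encoding_type"]; that key layout and the function name has_eval_encoding are assumptions, so adjust them to the actual config file.

```python
def has_eval_encoding(config):
    """Return True if the configured encoding type contains the word 'eval'.

    Assumption: the encoding type is stored under config["data"]["encoding_type"].
    Hypothetical helper, not part of the ATISS codebase.
    """
    encoding_type = config.get("data", {}).get("encoding_type", "")
    return "eval" in str(encoding_type)
```

A script could call this right after loading the config and stop early with a clear message when it returns False.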

In the output directory, the script generates two folders for each completion, one that contains the mesh files of the initial partial scene and another one that contains the mesh files of the completed scene.

Object Suggestions

We also provide a script that performs object suggestions based on a user-specified region of acceptable positions. As with the previous scripts, you can execute it by running

python path_to_config_yaml path_to_output_dir path_to_3d_future_pickled_data path_to_floor_plan_texture_images --weight_file path_to_weight_file

where the argument --weight_file specifies the path to a trained model and the argument path_to_config_yaml defines the path to the config file used to train that particular model. For this script too, please make sure that the encoding type in the config file also contains the word eval. By default, this script randomly selects a room from the test set, and the user can choose either to remove some objects or to keep the room unchanged. Subsequently, the user needs to specify the acceptable positions for placing an object using 6 comma-separated numbers that define the bounding box of the valid positions. As with the previous scripts, you can select a particular room via the --scene_id argument.
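To illustrate the 6 comma-separated numbers, the sketch below parses such a string into the two corners of a bounding box. The order xmin, ymin, zmin, xmax, ymax, zmax is an assumption, since the text does not state the order the script expects, and parse_region is a hypothetical helper.

```python
def parse_region(text):
    """Parse 6 comma-separated numbers into (min_corner, max_corner).

    Assumed order: xmin, ymin, zmin, xmax, ymax, zmax (not confirmed by the
    source). Hypothetical helper, not part of the ATISS codebase.
    """
    values = [float(v) for v in text.split(",")]
    if len(values) != 6:
        raise ValueError(f"expected 6 numbers, got {len(values)}")
    min_corner, max_corner = tuple(values[:3]), tuple(values[3:])
    if any(lo > hi for lo, hi in zip(min_corner, max_corner)):
        raise ValueError("each minimum must not exceed its maximum")
    return min_corner, max_corner
```

For example, parse_region("0,0,0,1,2,3") yields the corners (0.0, 0.0, 0.0) and (1.0, 2.0, 3.0), while malformed input raises a ValueError.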

In the output directory, the script generates two folders in each run, one that contains the mesh files of the initial scene and another one that contains the mesh files of the completed scene with the suggested object.

Failure Cases Detection and Correction

We also provide a script that performs failure case correction on a scene that contains a problematic object. You can execute it by running

python path_to_config_yaml path_to_output_dir path_to_3d_future_pickled_data path_to_floor_plan_texture_images --weight_file path_to_weight_file

where the argument --weight_file specifies the path to a trained model and the argument path_to_config_yaml defines the path to the config file used to train that particular model. For this script too, please make sure that the encoding type in the config file also contains the word eval. By default, this script randomly selects a room from the test set, and the user needs to select an object inside the room that will be moved to an unnatural position. Given the scene with the unnaturally positioned object, our model identifies the problematic object and repositions it in a more plausible position.

In the output directory, the script generates two folders in each run, one that contains the mesh files of the initial scene with the problematic object and another one that contains the mesh files of the new scene.


Finally, to train a new network from scratch, we provide a training script. To execute this script, you need to specify the path to the configuration file you wish to use and the path to the output directory, where the trained models and the training statistics will be saved. Namely, to train a new model from scratch, you simply need to run

python path_to_config_yaml path_to_output_dir

Note that it is also possible to start from a previously trained model by specifying the --weight_file argument, which should contain the path to a previously trained model.

Note that if you want to use the RAdam optimizer during training, you will also have to download and install the corresponding code from this repository.

We also provide the option to log the experiment's evolution using Weights & Biases. To do that, you need to set the --with_wandb_logger argument and to have wandb installed in your conda environment.

Relevant Research

Please also check out the following papers that explore similar ideas:

  • Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models
  • Sceneformer: Indoor Scene Generation with Transformers