PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

DriveGAN: Towards a Controllable High-Quality Neural Simulation
Seung Wook Kim, Jonah Philion, Antonio Torralba, Sanja Fidler
CVPR (oral), 2021
[Paper] [Project Page]

Abstract: Realistic simulators are critical for training and verifying robotics systems. While most of the contemporary simulators are hand-crafted, a scaleable way to build simulators is to use machine learning to learn how the environment behaves in response to an action, directly from data. In this work, we aim to learn to simulate a dynamic environment directly in pixel-space, by watching unannotated sequences of frames and their associated action pairs. We introduce a novel high-quality neural simulator referred to as DriveGAN that achieves controllability by disentangling different components without supervision. In addition to steering controls, it also includes controls for sampling features of a scene, such as the weather as well as the location of non-player objects. Since DriveGAN is a fully differentiable simulator, it further allows for re-simulation of a given video sequence, offering an agent to drive through a recorded scene again, possibly taking different actions. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data. We showcase that our approach greatly surpasses the performance of previous data-driven simulators, and allows for new features not explored before.

For business inquiries, please contact [email protected]

For press and other inquiries, please contact Hector Marinez at [email protected]

Citation

  • If you found this codebase useful in your research, please cite:
@inproceedings{kim2021drivegan,
  title={DriveGAN: Towards a Controllable High-Quality Neural Simulation},
  author={Kim, Seung Wook and Philion, Jonah and Torralba, Antonio and Fidler, Sanja},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5820--5829},
  year={2021}
}

Environment Setup

This codebase is tested with Ubuntu 18.04 and Python 3.6.9, and will most likely work with other nearby Python 3 versions.

  • Clone the repository
git clone https://github.com/nv-tlabs/DriveGAN_code.git
cd DriveGAN_code
  • Install dependencies
pip install -r requirements.txt

Data

We provide a dataset derived from the CARLA simulator (https://carla.org/, https://github.com/carla-simulator/carla). This dataset is distributed under the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC BY-NC 4.0).

All data are available at the following link: https://drive.google.com/drive/folders/1fGM6KVzBL9M-6r7058fqyVnNcHVnYoJ3?usp=sharing

Training

Stage 1 (VAE-GAN)

If you want to skip Stage 1 training, go to the Stage 2 (Dynamics Engine) section. For Stage 1 training, download {0-5}.tar.gz (that is, 0.tar.gz through 5.tar.gz) from the link and extract them. Each extracted dataset directory has a name starting with 6405; rename them data1 (from 0.tar.gz) through data6 (from 5.tar.gz).

cd DriveGAN_code/latent_decoder_model
mkdir img_data && cd img_data
tar -xvzf {0-5}.tar.gz
mv 6405x data{1-6}
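The extract-and-rename step above is written in shorthand ({0-5}, 6405x, data{1-6}), so it cannot be pasted into a shell as is. The following is a minimal Python sketch of the same step. The function name extract_and_rename is hypothetical and not part of the repository, and the sketch assumes each archive unpacks to exactly one new top-level directory.

```python
import tarfile
from pathlib import Path


def extract_and_rename(archive_dir, out_dir, count=6):
    """Extract 0.tar.gz .. (count-1).tar.gz and rename the results to data1 .. data<count>.

    Assumption (not stated in the README): each archive unpacks to exactly
    one new top-level directory inside out_dir.
    """
    archive_dir, out_dir = Path(archive_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    renamed = []
    for i in range(count):
        # Snapshot the directory so the newly extracted entry can be identified.
        before = {p.name for p in out_dir.iterdir()}
        with tarfile.open(archive_dir / f"{i}.tar.gz", "r:gz") as tar:
            tar.extractall(out_dir)
        new = sorted({p.name for p in out_dir.iterdir()} - before)
        if len(new) != 1:
            raise RuntimeError(f"expected one new directory from {i}.tar.gz, found {new}")
        # 0.tar.gz becomes data1, ..., 5.tar.gz becomes data6.
        target = out_dir / f"data{i + 1}"
        (out_dir / new[0]).rename(target)
        renamed.append(target)
    return renamed
```

For example, with the archives downloaded into img_data, calling extract_and_rename("img_data", "img_data") from DriveGAN_code/latent_decoder_model would produce img_data/data1 through img_data/data6.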

Then, run

./scripts/train.sh ./img_data/data1,./img_data/data2,./img_data/data3,./img_data/data4,./img_data/data5,./img_data/data6
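The training and encoding scripts take the dataset directories as a single comma-separated argument, which is easy to mistype by hand. A small sketch for generating that string is shown here; the helper name dataset_arg is hypothetical, and the default root and count simply mirror the command above.

```python
def dataset_arg(root="./img_data", count=6):
    """Return 'root/data1,root/data2,...' as one comma-separated string."""
    return ",".join(f"{root}/data{i}" for i in range(1, count + 1))


# Print the argument so it can be pasted after ./scripts/train.sh
print(dataset_arg())
```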

You can monitor training progress with TensorBoard by pointing it at the log_dir specified in train.sh.

Once the validation loss converges, encode the dataset with the learned model (saved in the log_dir used for training):
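The README does not define a convergence criterion, so judging it from the TensorBoard curve is the expected workflow. If a rough numeric check is useful, one possible heuristic is to compare the mean loss over the two most recent windows. The function name, window size, and tolerance below are all assumptions chosen for illustration, not values from the repository.

```python
def has_converged(losses, window=5, rel_tol=0.01):
    """Heuristic plateau check on a list of validation losses.

    Returns True when the mean of the last `window` values differs from the
    mean of the preceding `window` values by at most rel_tol (relative).
    The window and tolerance are illustrative assumptions.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    prev = sum(losses[-2 * window:-window]) / window
    last = sum(losses[-window:]) / window
    return abs(prev - last) <= rel_tol * abs(prev)
```

A steadily falling curve returns False, while a flat one returns True; the thresholds would need tuning against the actual loss scale.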

./scripts/encode.sh ${path to saved model} 1 0 ./img_data/data1,./img_data/data2,./img_data/data3,./img_data/data4,./img_data/data5,./img_data/data6 ../encoded_data/data

Stage 2 (Dynamics Engine)

If you did not do Stage 1 training, download encoded_data.tar.gz and vaegan_iter210000.pt from the link, and extract the archive.

cd DriveGAN_code
mkdir encoded_data
tar -xvzf encoded_data.tar.gz -C encoded_data

Then, whether you encoded the data yourself in Stage 1 or downloaded it, run

cd DriveGAN_code
./scripts/train.sh encoded_data/data ${path to saved vae-gan model}

Playing with trained model

If you want to skip training, download simulator_epoch1020.pt and vaegan_iter210000.pt from the link.

To play with a trained model, run

./scripts/play/server.sh ${path to saved dynamics engine} ${port e.g. 8888} ${path to saved vae-gan model}

Now you can navigate to localhost:{port} in your browser (tested on Chrome) and play.

(Controls - 'w': speed up, 's': slow down, 'a': steer left, 'd': steer right)

There are also additional buttons for changing contents. To sample a new scene, simply refresh the webpage.
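If the page does not load, it can help to confirm that something is actually listening on the chosen port before debugging the browser side. The following sketch uses only the Python standard library; the function name server_is_up is hypothetical, and 8888 is just the example port from the command above.

```python
import socket


def server_is_up(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

This only checks that the port accepts connections; it does not verify that the models loaded correctly.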

License

This codebase and the trained models are distributed under the Nvidia Source Code License, and the dataset is distributed under CC BY-NC 4.0.

Code for VAE-GAN is adapted from https://github.com/rosinality/stylegan2-pytorch (License).

Code for LPIPS is imported from https://github.com/richzhang/PerceptualSimilarity (License).

StyleGAN custom ops are imported from https://github.com/NVlabs/stylegan2 (License).

Interactive UI code uses http://www.semantic-ui.com/ (License).
