World Models with TensorFlow 2

Last update: Nov 30, 2022

World Models

This repo reproduces the original implementation of World Models using TensorFlow 2.2.

Docker

The easiest way to handle dependencies is with NVIDIA Docker. Follow the instructions below to build the image, start the container, and attach to it.

docker image build -t wm:1.0 -f docker/Dockerfile.wm .
docker container run -p 8888:8888 --gpus '"device=0"' --detach -it --name wm wm:1.0
docker attach wm
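Once attached, it can be useful to confirm that the GPU passed in with `--gpus` is visible to TensorFlow. The following is a minimal sketch, not part of this repo: the helper name `visible_gpus` is hypothetical, and it assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` is available.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if none (or TF is not installed)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU visible to TensorFlow")
```

An empty result inside the container usually points at the container runtime or the `--gpus` flag rather than at the training code.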

Visualizations

To visualize the environment from the agent's perspective or to generate synthetic observations, use the visualizations Jupyter notebook. It can be launched from your container with the following:

jupyter notebook --no-browser --port=8888 --ip=0.0.0.0 --allow-root

Figure panels: Real Frame Sample, Reconstructed Real Frame, Imagined Frame; Ground Truth (CarRacing) vs. Reconstructed; Ground Truth Environment (DoomTakeCover) vs. Dream Environment.

Reproducing Results From Scratch

These instructions assume a machine with a 64-core CPU and a GPU. If running in the cloud, it will likely be more cost-effective to run the extraction and controller processes on a CPU machine and the VAE, preprocessing, and RNN tasks on a GPU machine.

DoomTakeCover-v0

CAUTION: The Doom environment leaves some processes hanging around. In addition to running the Doom experiments, the script kills any process with 'vizdoom' in its name, so be careful with this if you are not running in a container. To reproduce results for DoomTakeCover-v0, run the following bash script:

bash launch_scripts/wm_doom.bash
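When running outside a container, one way to see what the cleanup would affect is to list matching processes first. The sketch below is a dry run only and is not taken from the launch script: the helper names are hypothetical, and it assumes that matching means a plain substring search for 'vizdoom' in the command line as reported by `ps -eo pid,args`.

```python
import subprocess


def matching_processes(ps_lines, needle="vizdoom"):
    """Return (pid, command) pairs whose command line contains needle."""
    matches = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        # Skip the header row and malformed lines; keep rows that start with a PID.
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            matches.append((int(parts[0]), parts[1]))
    return matches


def list_running(needle="vizdoom"):
    """List running processes that match needle. Nothing is killed here."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return matching_processes(result.stdout.splitlines(), needle)
```

Calling `list_running()` before launching the script shows any unrelated process that happens to contain 'vizdoom' in its command line.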

CarRacing-v0

To reproduce results for CarRacing-v0, run the following bash script:

bash launch_scripts/carracing.bash

Disclaimer

I have not run this for long enough (~45 days of wall-clock time) to verify that it produces the same results on CarRacing-v0 as the original implementation.

Average return curves comparing the original implementation and ours. The shaded area represents a standard deviation above and below the mean.
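For readers who want to build a similar band from their own logs, here is a minimal sketch of a mean curve with one standard deviation on either side. It is not the plotting code used for the figure: the function name and the sample numbers are hypothetical, and it assumes the population standard deviation across runs at each evaluation point.

```python
import statistics


def return_band(returns_per_run):
    """For each evaluation point, return (mean - std, mean, mean + std) across runs."""
    band = []
    # zip(*...) groups the i-th evaluation point of every run together.
    for values in zip(*returns_per_run):
        mean = statistics.mean(values)
        std = statistics.pstdev(values)  # population std; an assumption
        band.append((mean - std, mean, mean + std))
    return band


# Hypothetical returns from three runs at two evaluation points.
runs = [[100.0, 200.0], [200.0, 400.0], [300.0, 600.0]]
band = return_band(runs)
```

The lower and upper values of each tuple are the edges of the shaded area, and the middle value is the curve itself.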

For simplicity, the Doom experiment implementation is slightly different from the original:

We do not use weighted cross entropy loss for done predictions
We train the RNN with sequences that always begin at the start of an episode (as opposed to random subsequences)
We sample whether the agent dies (as opposed to a deterministic cut-off)
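The three differences above can be illustrated with a small pure-Python sketch. It is not the code from this repo or from the original implementation: the function names, the `pos_weight` value, and the `threshold` value are hypothetical and only serve to contrast the two options in each case.

```python
import math
import random


def done_loss(p_done, done, pos_weight=1.0):
    """Binary cross entropy for one done prediction.

    pos_weight=1.0 is the plain, unweighted loss. A value above 1.0
    up-weights the done=1 steps (the weighted variant).
    """
    eps = 1e-7
    p = min(max(p_done, eps), 1.0 - eps)  # keep log() finite
    return -(pos_weight * done * math.log(p) + (1 - done) * math.log(1.0 - p))


def training_window(episode, seq_len, from_start=True, rng=random):
    """Pick a training sequence of at most seq_len steps from one episode."""
    if from_start or len(episode) <= seq_len:
        # Always begin at the first step of the episode.
        return episode[:seq_len]
    # Alternative: a random subsequence of the episode.
    start = rng.randrange(len(episode) - seq_len + 1)
    return episode[start:start + seq_len]


def agent_dies(p_done, sample=True, threshold=0.5, rng=random):
    """Decide whether a dream episode ends at this step."""
    if sample:
        # Draw a Bernoulli sample with probability p_done.
        return rng.random() < p_done
    # Alternative: deterministic cut-off on the predicted probability.
    return p_done > threshold
```

With the defaults (`pos_weight=1.0`, `from_start=True`, `sample=True`), the functions follow the choices listed above. Changing those arguments gives the alternatives the list contrasts them with.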

Implementation	Temperature τ	Returns (dream environment)	Returns (actual environment)
D. Ha Original	1.0	1145 +/- 690	868 +/- 511
Eager	1.0	1465 +/- 633	849 +/- 499

Owner

Zac Wellmer
