PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

Overview

VIN: Value Iteration Networks

This is a PyTorch implementation of Value Iteration Networks (VIN) that reproduces the results of the paper. (See also the TensorFlow version.)

Architecture of Value Iteration Network

Key idea

  • A fully differentiable neural network with a 'planning' sub-module.
  • Value Iteration = Conv Layer + Channel-wise Max Pooling
  • Generalizes better than reactive policies to new, unseen tasks.
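
The correspondence in the second bullet can be written out as follows. This is a sketch in standard value-iteration notation; the symbols (Q, V, R, P, gamma, W, and the stacked image X) are not defined in this README and are introduced here only for illustration. The first two lines are classical value iteration. The last two are the same update when transitions are local, so the sum over next states becomes a small convolution window and the max over actions becomes a max over channels.

```latex
\begin{align}
Q_n(s, a) &= R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V_n(s') \\
V_{n+1}(s) &= \max_{a} Q_n(s, a) \\[4pt]
\bar{Q}_{a,\, i',\, j'} &= \sum_{l,\, i,\, j} W^{a}_{l,\, i,\, j}\, \bar{X}_{l,\, i'-i,\, j'-j}
  \qquad \text{(convolution over the stacked reward and value images } \bar{X}\text{)} \\
\bar{V}_{i',\, j'} &= \max_{a} \bar{Q}_{a,\, i',\, j'}
  \qquad \text{(channel-wise max pooling)}
\end{align}
```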

Learned Reward Image and Its Value Images for each VI Iteration

[Images omitted: learned reward image and value images for the 8x8, 16x16, and 28x28 grid worlds.]

Dependencies

This repository requires the following packages:

  • Python >= 3.6
  • NumPy >= 1.12.1
  • PyTorch >= 0.1.10
  • SciPy >= 0.19.0
  • visdom >= 0.1

Datasets

Each data sample consists of the (x, y) coordinates of the current state in the grid world, followed by an obstacle image and a goal image.
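
The array names stored inside the .npz files are not documented here, so rather than guess them, the sketch below only lists whatever arrays an archive contains, with their shapes and dtypes. `describe_npz` is a hypothetical helper written for this illustration, not part of the repository.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an .npz archive."""
    # np.load on an .npz file returns a lazy, dict-like archive; `files` lists its keys.
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


# Example (assumes the file has been downloaded to this path):
# for name, (shape, dtype) in describe_npz("data/gridworld_8x8.npz").items():
#     print(name, shape, dtype)
```

Inspecting the shapes first makes it easy to see which arrays hold the state coordinates and which hold the obstacle and goal images before writing a data loader.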

Dataset size   8x8     16x16    28x28
Train set      77760   776440   4510695
Test set       12960   129440   751905

Running Experiment: Training

Grid world 8x8

python run.py --datafile data/gridworld_8x8.npz --imsize 8 --lr 0.005 --epochs 30 --k 10 --batch_size 128

Grid world 16x16

python run.py --datafile data/gridworld_16x16.npz --imsize 16 --lr 0.008 --epochs 30 --k 20 --batch_size 128

Grid world 28x28

python run.py --datafile data/gridworld_28x28.npz --imsize 28 --lr 0.003 --epochs 30 --k 36 --batch_size 128

Flags:

  • datafile: Path to the data file.
  • imsize: Size of the input images. One of: [8, 16, 28]
  • lr: Learning rate for the RMSProp optimizer. Recommended: [0.01, 0.005, 0.002, 0.001]
  • epochs: Number of epochs to train. Default: 30
  • k: Number of value iterations. Recommended: [10 for 8x8, 20 for 16x16, 36 for 28x28]
  • ch_i: Number of channels in the input layer. Default: 2, i.e. the obstacle image and the goal image.
  • ch_h: Number of channels in the first convolutional layer. Default: 150, as described in the paper.
  • ch_q: Number of channels in the q layer (~actions) of the VI module. Default: 10, as described in the paper.
  • batch_size: Batch size. Default: 128
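
For reference, the flags above could be declared with `argparse` roughly as follows. This is a sketch, not the contents of run.py: the defaults are the ones listed above, while marking `datafile`, `imsize`, `lr`, and `k` as required is an assumption made here because no defaults are given for them.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags (not the repository's code)."""
    parser = argparse.ArgumentParser(description="VIN training flags (sketch)")
    # No defaults are documented for these four, so they are required here (assumption).
    parser.add_argument("--datafile", type=str, required=True, help="Path to the data file")
    parser.add_argument("--imsize", type=int, choices=[8, 16, 28], required=True,
                        help="Size of the input images")
    parser.add_argument("--lr", type=float, required=True, help="RMSProp learning rate")
    parser.add_argument("--k", type=int, required=True, help="Number of value iterations")
    # Documented defaults.
    parser.add_argument("--epochs", type=int, default=30, help="Number of epochs to train")
    parser.add_argument("--ch_i", type=int, default=2, help="Channels in the input layer")
    parser.add_argument("--ch_h", type=int, default=150, help="Channels in the first conv layer")
    parser.add_argument("--ch_q", type=int, default=10, help="Channels in the q layer")
    parser.add_argument("--batch_size", type=int, default=128, help="Batch size")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```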

Visualization with Visdom

We shall visualize the learned reward image and its corresponding value images for each VI iteration by using visdom.

First, start the server:

python -m visdom.server

Open Visdom in your browser at http://localhost:8097

Then run the following to visualize the learned reward and value images:

python vis.py --datafile learned_rewards_values_28x28.npz

NOTE: If you would like to produce a GIF animation of the value images yourself, the following ImageMagick command might be useful.

convert -delay 20 -loop 0 *.png value_function.gif

Benchmarks

GPU: TITAN X

Performance: Test Accuracy

NOTE: This is the accuracy on the test set. It differs from the table in the paper, which reports the success rate of rollouts of the learned policy in the environment.

Test Accuracy   8x8      16x16    28x28
PyTorch         99.16%   92.44%   88.20%
TensorFlow      99.03%   90.2%    82%
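
To make the distinction in the note concrete, test accuracy of this kind is typically computed per sample, without running the policy in the environment. The sketch below assumes the network outputs one score per action and that each test sample carries an integer action label; `action_accuracy` is a hypothetical helper, not the repository's evaluation code.

```python
import torch


def action_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring action equals the labeled action.

    logits: float tensor of shape [N, num_actions]
    labels: integer tensor of shape [N]
    """
    predictions = logits.argmax(dim=1)
    return (predictions == labels).float().mean().item()
```

A rollout success rate, by contrast, would require following the predicted actions step by step until the goal is reached or the episode fails.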

Speed with GPU

Speed per epoch   8x8   16x16   28x28
PyTorch           3s    15s     100s
TensorFlow        4s    25s     165s

Frequently Asked Questions

  • Q: How is the reward image obtained from the observation?

    • A: The observation image has 2 channels. The first channel is the obstacle image (0: free, 1: obstacle). The second channel is the goal image (0: free, 10: goal). For example, in the 8x8 grid world, an input tensor with batch size 128 has shape [128, 2, 8, 8]. It is fed into a convolutional layer with a [3, 3] filter and 150 feature maps, followed by another convolutional layer with a [3, 3] filter and 1 feature map. The output tensor has shape [128, 1, 8, 8]. This is the reward image.
  • Q: What exactly is the transition model, and how does the VI module obtain the value image from the reward image?

    • A: Assume a batch size of 128 in the 8x8 grid world. Once we have the reward image with shape [128, 1, 8, 8], we apply a convolutional layer to produce the q layers of the VI module. The [3, 3] filters represent the transition probabilities. There is a set of 10 filters, each generating one feature map in the q layers, and each feature map corresponds to an "action". Note that this is more than the number of actions actually available, which is only 8. We then apply channel-wise max pooling to obtain the value image with shape [128, 1, 8, 8]. Finally, we stack this value image with the reward image as the input to the next VI iteration.
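
The two answers above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the repository's model: the class and attribute names are hypothetical, `padding=1` is assumed so that the 8x8 spatial size is preserved as the shapes above require, and the value image starts at zero so that a single 2-channel convolution covers both the first step (reward only) and the later steps (reward stacked with value).

```python
import torch
import torch.nn as nn


class VINSketch(nn.Module):
    """Illustrative reward + VI module (hypothetical names, not the repository's code)."""

    def __init__(self, ch_i=2, ch_h=150, ch_q=10):
        super().__init__()
        # Observation [B, ch_i, H, W] -> hidden features [B, ch_h, H, W].
        self.h = nn.Conv2d(ch_i, ch_h, kernel_size=3, padding=1)
        # Hidden features -> one-channel reward image [B, 1, H, W].
        self.r = nn.Conv2d(ch_h, 1, kernel_size=3, padding=1, bias=False)
        # 3x3 filters over the stacked [reward, value] image act as the transition model;
        # each of the ch_q output channels corresponds to one "action".
        self.q = nn.Conv2d(2, ch_q, kernel_size=3, padding=1, bias=False)

    def forward(self, obs, k):
        r = self.r(self.h(obs))       # reward image, [B, 1, H, W]
        v = torch.zeros_like(r)       # assumption: value image starts at zero
        for _ in range(k):
            q = self.q(torch.cat([r, v], dim=1))      # q layers, [B, ch_q, H, W]
            v, _ = torch.max(q, dim=1, keepdim=True)  # channel-wise max, [B, 1, H, W]
        return r, v


if __name__ == "__main__":
    model = VINSketch()
    obs = torch.zeros(128, 2, 8, 8)  # batch of 8x8 observations
    r, v = model(obs, k=10)
    print(r.shape, v.shape)
```

Because the q convolution has no bias and the value image starts at zero, the first pass through the loop depends on the reward image alone, which matches the description of applying the q layers to the reward image before any stacking.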

Owner
Xingdong Zuo