PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

Overview

VIN: Value Iteration Networks

This is an implementation of Value Iteration Networks (VIN) in PyTorch to reproduce the results.(TensorFlow version)

Architecture of Value Iteration Network

Key idea

  • A fully differentiable neural network with a 'planning' sub-module.
  • Value Iteration = Conv Layer + Channel-wise Max Pooling
  • Generalize better than reactive policies for new, unseen tasks.

Learned Reward Image and Its Value Images for each VI Iteration

Visualization Grid world Reward Image Value Images
8x8
16x16
28x28

Dependencies

This repository requires following packages:

  • Python >= 3.6
  • Numpy >= 1.12.1
  • PyTorch >= 0.1.10
  • SciPy >= 0.19.0
  • visdom >= 0.1

Datasets

Each data sample consists of (x, y) coordinates of current state in grid world, followed by an obstacle image and a goal image.

Dataset size 8x8 16x16 28x28
Train set 77760 776440 4510695
Test set 12960 129440 751905

Running Experiment: Training

Grid world 8x8

python run.py --datafile data/gridworld_8x8.npz --imsize 8 --lr 0.005 --epochs 30 --k 10 --batch_size 128

Grid world 16x16

python run.py --datafile data/gridworld_16x16.npz --imsize 16 --lr 0.008 --epochs 30 --k 20 --batch_size 128

Grid world 28x28

python run.py --datafile data/gridworld_28x28.npz --imsize 28 --lr 0.003 --epochs 30 --k 36 --batch_size 128

Flags:

  • datafile: The path to the data files.
  • imsize: The size of input images. From: [8, 16, 28]
  • lr: Learning rate with RMSProp optimizer. Recommended: [0.01, 0.005, 0.002, 0.001]
  • epochs: Number of epochs to train. Default: 30
  • k: Number of Value Iterations. Recommended: [10 for 8x8, 20 for 16x16, 36 for 28x28]
  • ch_i: Number of channels in input layer. Default: 2, i.e. obstacles image and goal image.
  • ch_h: Number of channels in first convolutional layer. Default: 150, described in paper.
  • ch_q: Number of channels in q layer (~actions) in VI-module. Default: 10, described in paper.
  • batch_size: Batch size. Default: 128

Visualization with Visdom

We shall visualize the learned reward image and its corresponding value images for each VI iteration by using visdom.

Firstly start the server

python -m visdom.server

Open Visdom in browser in http://localhost:8097

Then run following to visualize learn reward and value images.

python vis.py --datafile learned_rewards_values_28x28.npz

NOTE: If you would like to produce GIF animation of value images on your own, the following command might be useful.

convert -delay 20 -loop 0 *.png value_function.gif

Benchmarks

GPU: TITAN X

Performance: Test Accuracy

NOTE: This is the accuracy on test set. It is different from the table in the paper, which indicates the success rate from rollouts of the learned policy in the environment.

Test Accuracy 8x8 16x16 28x28
PyTorch 99.16% 92.44% 88.20%
TensorFlow 99.03% 90.2% 82%

Speed with GPU

Speed per epoch 8x8 16x16 28x28
PyTorch 3s 15s 100s
TensorFlow 4s 25s 165s

Frequently Asked Questions

  • Q: How to get reward image from observation ?

    • A: Observation image has 2 channels. First channel is obstacle image (0: free, 1: obstacle). Second channel is goal image (0: free, 10: goal). For example, in 8x8 grid world, the shape of an input tensor with batch size 128 is [128, 2, 8, 8]. Then it is fed into a convolutional layer with [3, 3] filter and 150 feature maps, followed by another convolutional layer with [3, 3] filter and 1 feature map. The shape of the output tensor is [128, 1, 8, 8]. This is the reward image.
  • Q: What is exactly transition model, and how to obtain value image by VI-module from reward image ?

    • A: Let us assume batch size is 128 under 8x8 grid world. Once we obtain the reward image with shape [128, 1, 8, 8], we do convolutional layer for q layers in VI module. The [3, 3] filter represents the transition probabilities. There is a set of 10 filters, each for generating a feature map in q layers. Each feature map corresponds to an "action". Note that this is larger than real available actions which is only 8. Then we do a channel-wise Max Pooling to obtain the value image with shape [128, 1, 8, 8]. Finally we stack this value image with reward image for a new VI iteration.

References

Further Readings

Owner
Xingdong Zuo
AI in well-being is my dream. Neural networks need to understand the world causally.
Xingdong Zuo
Code for "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" paper

Myo Keylogging This is the source code for our paper My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack by Matthias Ga

Secure Mobile Networking Lab 7 Jan 03, 2023
Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

zhanglabNKU 1 Nov 29, 2022
PyDeepFakeDet is an integrated and scalable tool for Deepfake detection.

PyDeepFakeDet An integrated and scalable library for Deepfake detection research. Introduction PyDeepFakeDet is an integrated and scalable Deepfake de

Junke, Wang 49 Dec 11, 2022
QuALITY: Question Answering with Long Input Texts, Yes!

QuALITY: Question Answering with Long Input Texts, Yes! Authors: Richard Yuanzhe Pang,* Alicia Parrish,* Nitish Joshi,* Nikita Nangia, Jason Phang, An

ML² AT CILVR 61 Jan 02, 2023
Implementation for our AAAI2021 paper (Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).

SSAN Introduction This is the pytorch implementation of the SSAN model (see our AAAI2021 paper: Entity Structure Within and Throughout: Modeling Menti

benfeng 69 Nov 15, 2022
A novel pipeline framework for multi-hop complex KGQA task. About the paper title: Improving Multi-hop Embedded Knowledge Graph Question Answering by Introducing Relational Chain Reasoning

Rce-KGQA A novel pipeline framework for multi-hop complex KGQA task. This framework mainly contains two modules, answering_filtering_module and relati

金伟强 -上海大学人工智能小渣渣~ 16 Nov 18, 2022
Dataset Condensation with Contrastive Signals

Dataset Condensation with Contrastive Signals This repository is the official implementation of Dataset Condensation with Contrastive Signals (DCC). T

3 May 19, 2022
[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)

Computer Vision - NTUA (2020-2021) This repository hosts the lab projects and theoretical problem sets of the Computer Vision course held by ECE NTUA

Dimitris Dimos 6 Jul 21, 2022
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour

Gordo Building thousands of models with timeseries data to monitor systems. Table of content About Examples Install Uninstall Developer manual How to

Equinor 26 Dec 27, 2022
NaturalProofs: Mathematical Theorem Proving in Natural Language

NaturalProofs: Mathematical Theorem Proving in Natural Language NaturalProofs: Mathematical Theorem Proving in Natural Language Sean Welleck, Jiacheng

Sean Welleck 83 Jan 05, 2023
3D detection and tracking viewer (visualization) for kitti & waymo dataset

3D detection and tracking viewer (visualization) for kitti & waymo dataset

222 Jan 08, 2023
Iterative Normalization: Beyond Standardization towards Efficient Whitening

IterNorm Code for reproducing the results in the following paper: Iterative Normalization: Beyond Standardization towards Efficient Whitening Lei Huan

Lei Huang 21 Dec 27, 2022
A curated list of neural rendering resources.

Awesome-of-Neural-Rendering A curated list of neural rendering and related resources. Please feel free to pull requests or open an issue to add papers

Zhiwei ZHANG 43 Dec 09, 2022
A way to store images in YAML.

YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installa

5 Mar 14, 2022
The Submission for SIMMC 2.0 Challenge 2021

The Submission for SIMMC 2.0 Challenge 2021 challenge website Requirements python 3.8.8 pytorch 1.8.1 transformers 4.8.2 apex for multi-gpu nltk Prepr

5 Jul 26, 2022
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE - Real Time Video Interpolation arXiv | YouTube | Colab | Tutorial | Demo Table of Contents Introduction Collection Usage Evaluation Training and

hzwer 3k Jan 04, 2023
Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction"

To run a generation experiment (either conceptnet or atomic), follow these instructions: First Steps First clone, the repo: git clone https://github.c

Antoine Bosselut 575 Jan 01, 2023
[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects YouTube | arXiv Prerequisites Kaolin is available here:

Denys Rozumnyi 107 Dec 26, 2022
TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

Ivana Balazevic 296 Dec 06, 2022
Python package to add text to images, textures and different backgrounds

nider Python package for text images generation and watermarking Free software: MIT license Documentation: https://nider.readthedocs.io. nider is an a

Vladyslav Ovchynnykov 131 Dec 30, 2022