Multi-Stage Episodic Control for Strategic Exploration in Text Games

Last update: May 24, 2022

Overview

XTX: eXploit - Then - eXplore

Requirements

First clone this repo using git clone https://github.com/princeton-nlp/XTX.git

Please create two conda environments as follows:

1. conda env create -f yml_envs/jericho-wt.yml
   a. conda activate jericho-wt
   b. pip install git+https://github.com/jens321/[email protected]
2. conda env create -f yml_envs/jericho-no-wt.yml

The first set of commands creates a conda environment called jericho-wt, whose Jericho install adds actions to the game grammar for specific games (the games marked with * in the paper). The second command creates a conda environment called jericho-no-wt, which installs an unmodified version of the Jericho library.

Training

All code can be run from the root folder of this project. Please follow the commands below for each specific model:

XTX: sh scripts/run_xtx.sh
XTX (no-mix): sh scripts/run_xtx_no_mix.sh
XTX (uniform): sh scripts/run_xtx_uniform.sh
XTX ($\lambda$ = 0, 0.5, or 1): sh scripts/run_xtx_ablation.sh
INV DY: sh scripts/run_inv_dy.sh
DRRN: sh scripts/run_drrn.sh
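
If you would rather launch runs from Python, the commands above can be wrapped in a small helper. This is only a sketch: the script paths follow the list above, while the short model keys and the names `SCRIPTS`, `build_command`, and `run_model` are made up for illustration and are not part of the repository.

```python
import subprocess

# Script paths follow the list above; the short keys are hypothetical.
SCRIPTS = {
    "xtx": "scripts/run_xtx.sh",
    "xtx_no_mix": "scripts/run_xtx_no_mix.sh",
    "xtx_uniform": "scripts/run_xtx_uniform.sh",
    "xtx_ablation": "scripts/run_xtx_ablation.sh",
    "inv_dy": "scripts/run_inv_dy.sh",
    "drrn": "scripts/run_drrn.sh",
}


def build_command(model):
    """Return the argv list that launches the training script for `model`."""
    if model not in SCRIPTS:
        raise ValueError(f"unknown model {model!r}; choose from {sorted(SCRIPTS)}")
    return ["sh", SCRIPTS[model]]


def run_model(model, repo_root="."):
    """Run the script from the repository root and return its exit code."""
    return subprocess.run(build_command(model), cwd=repo_root).returncode
```

Calling `run_model("drrn")` from a shell with the appropriate conda environment activated should behave like running `sh scripts/run_drrn.sh` from the root folder.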

Notes

You can use analysis/sample_env.py for quickly playing around with a sample Jericho environment. Run it using python3 -m analysis.sample_env.
You can use analysis/augment_wt.py for generating the missing action candidates that can be added to the game grammar (games with * in the paper). Run it using python3 -m analysis.augment_wt.
Note that all models should finish within a day or two given 1 GPU and 8 CPUs, except for games where Jericho's valid action handicap is slow (e.g. Library, Dragon). Because the valid action handicap relies heavily on parallelization, increasing the number of CPUs (e.g. from 8 to 16) also gives a good speedup.
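
To give a feel for what playing around with a sample Jericho environment looks like, here is a minimal random-agent loop. It is a sketch and not the contents of analysis/sample_env.py. It assumes an `env` object with Jericho's FrotzEnv-style methods (`reset`, `get_valid_actions`, `step`); the function name `play_random_episode` and its parameters are hypothetical.

```python
import random


def play_random_episode(env, max_steps=50, seed=0):
    """Play one episode by picking uniformly among the valid actions.

    Assumes `env` follows Jericho's FrotzEnv-style interface:
      reset() -> (obs, info)
      get_valid_actions() -> list of action strings
      step(action) -> (obs, reward, done, info)
    """
    rng = random.Random(seed)
    obs, info = env.reset()
    total_reward = 0
    trajectory = []  # (action, reward) pairs in the order they were taken
    for _ in range(max_steps):
        actions = env.get_valid_actions()
        if not actions:
            break  # nothing valid to try in this state
        action = rng.choice(actions)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        trajectory.append((action, reward))
        if done:
            break
    return total_reward, trajectory
```

With Jericho installed and a game file available, something like `play_random_episode(jericho.FrotzEnv("path/to/game.z5"))` should work, where the path is a placeholder. Each call to `get_valid_actions` uses the valid action handicap mentioned above, which is why it can dominate the running time on some games.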

Acknowledgements

We used Weights & Biases for experiment tracking and visualizations to develop insights for this paper.

Some of the code borrows from the TDQN repo.

For any questions please contact Jens Tuyls ([email protected]).


Owner

Princeton Natural Language Processing
