PyTorch implementation for the paper StARformer: Transformer with State-Action-Reward Representations.

Last update: Dec 09, 2022

StARformer

This repository contains the PyTorch implementation of our paper, StARformer: Transformer with State-Action-Reward Representations. StARformer learns local State-Action-Reward representations (StAR-representations) to improve sequence modeling, particularly over long sequences, for reinforcement learning and imitation learning.

Results

Installation

Install the dependencies with Conda:

conda env create -f my_env.yml

Then install the Atari ROMs.

Datasets

Please follow these instructions to obtain the datasets.

Example usage

See run.sh or below:

python run_star_atari.py --seed 123 --data_dir_prefix [data_directory] --epochs 10 --num_steps 500000 --num_buffers 50 --batch_size 64 --seq_len 30 --model_type 'star' --game 'Breakout'

[data_directory] is where you place the Atari dataset.

Variants (`model_type`):

'star' (imitation)
'star_rwd' (offline RL)
'star_fusion' (see Figure 4a in our paper)
'star_stack' (see Figure 4b in our paper)
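
To compare variants, it can help to generate one command per `model_type`. The following is a minimal sketch, not part of this repository: `VARIANTS` and `build_command` are hypothetical names, and the flags and values are copied from the example command above. It only builds and prints the commands; it does not start training.

```python
import shlex

# Variant names and descriptions, copied from the list above.
VARIANTS = {
    "star": "imitation",
    "star_rwd": "offline RL",
    "star_fusion": "see Figure 4a in the paper",
    "star_stack": "see Figure 4b in the paper",
}


def build_command(model_type, data_dir, game="Breakout", seed=123):
    """Return the argument list for one run_star_atari.py invocation.

    Hypothetical helper: it mirrors the flags of the example command and
    rejects any model_type not listed in VARIANTS.
    """
    if model_type not in VARIANTS:
        raise ValueError(f"unknown model_type: {model_type!r}")
    return [
        "python", "run_star_atari.py",
        "--seed", str(seed),
        "--data_dir_prefix", data_dir,
        "--epochs", "10",
        "--num_steps", "500000",
        "--num_buffers", "50",
        "--batch_size", "64",
        "--seq_len", "30",
        "--model_type", model_type,
        "--game", game,
    ]


if __name__ == "__main__":
    # Print one shell-quoted command per variant for inspection.
    for name in VARIANTS:
        print(shlex.join(build_command(name, "[data_directory]")))
```

Replace `[data_directory]` with the location of the Atari dataset before running any of the printed commands. To launch a run from Python instead, the returned list can be passed to `subprocess.run`.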

Acknowledgement

This code is based on Decision-Transformer.

Owner

Jinghuan Shang
